class: top, center, title-slide

.title[
# .white[Sample Counts: Quantifying Uncertainty]
]
.subtitle[
## .white[EFB 390: Wildlife Ecology and Management]
]
.author[
### .white[Dr. Elie Gurarie]
]
.date[
### .white[September 20, 2022]
]

---

<!-- https://bookdown.org/yihui/rmarkdown/xaringan-format.html -->

## The trickiest thing in sampling

Is computing the *precision* (standard errors / confidence intervals)

![](images/MillerEstimator.png)

---

### General principle: The bigger the sample, the smaller the error.

.pull-left[
`\(k\)` sample frames with counts `\(n_i\)`, each of area `\(a\)`, out of a large area `\(A\)`; the total area sampled is much less than the total area:

`$$a_s = \sum_{i=1}^k a_i = k \times a \ll A$$`
]

.pull-right[
then:

`$$\widehat{N} = {A \over a_s} \sum n_i = {A \over k \times a} \sum n_i$$`

`$$SE(\widehat{N}) = {A \over a_s} \sqrt{\sum n_i} = {A \over a} {\sqrt{\sum n_i} \over k}$$`
]

--

.pull-left-40[![](images/Pop2.png)]

.pull-right-60.small[
.darkred[
- `\(\widehat{N} = {100 \times 100 \over 10 \times 10 \times 10} \times 21 = 210\)`
- `\(SE(\widehat{N}) = {100 \times 100 \over 10 \times 10 \times 10} \sqrt{21} = 45.8\)`
- `\(95\% \,\, C.I. = \widehat{N} \pm 1.96 \times SE(\widehat{N}) = \,\, ...\)`
- Coefficient of Variation = `\({SE(\widehat{N}) \over \widehat{N}} = \,\, ...\)`
]]

---
class: small

## Example - single transect, simple formula

.center[
`\(\large SE(\widehat{D}) = {1 \over a}\sqrt{\sum n_i}\,\,\,\)` and `\(\large SE(\widehat{N}) = A \times SE(\widehat{D})\)`
]

.pull-left[
.red.large.center[
`\(n = 8\)`; `\(a = 1000\)`; `\(A = 10,000\)`
]

![](Lecture06b_CountingAnimals_Uncertainty_files/figure-html/unnamed-chunk-3-1.png)<!-- -->
]

--

.pull-right[
#### point estimates

`$$\widehat{D} = 8/1,000 = 0.008$$`

`$$\widehat{N} = \widehat{D} \times A = 80$$`

#### standard errors:

`$$SE(\widehat{D}) = {\sqrt{8} \over 1000} = 0.0028$$`

`$$SE(\widehat{N}) = 0.0028 \times 10,000 = 28.28$$`

#### final abundance estimate:

.darkred[
`$$\widehat{N} = 80$$`
`$$95\%\, CI(\widehat{N}) = \widehat{N} \pm 1.96 \times SE(\widehat{N}) = \{24.5, 135\}$$`
]
]
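---
class: small

## Checking the single-transect arithmetic in R

A minimal sketch (not part of the original lecture code) of the computation on the previous slide; the object names are just illustrative:

```r
# single transect: n animals counted over area a, within total study area A
n <- 8; a <- 1000; A <- 10000

D.hat <- n / a                    # density estimate: 0.008
N.hat <- D.hat * A                # abundance estimate: 80
SE.D  <- sqrt(n) / a              # simple-formula SE of density: 0.0028
SE.N  <- A * SE.D                 # SE of abundance: 28.28
N.hat + c(-1, 1) * 1.96 * SE.N    # 95% CI: approx. (24.6, 135.4)
```
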
---

.pull-left[
#### This is why you *want* lots of transects:

![](Lecture06b_CountingAnimals_Uncertainty_files/figure-html/unnamed-chunk-4-1.png)<!-- -->

To capture variation!
]

--

.pull-right[
#### This is also why you go along the **gradient** of variation:

![](Lecture06b_CountingAnimals_Uncertainty_files/figure-html/unnamed-chunk-5-1.png)<!-- -->

.red[**gradient**] - means the direction of steepest change
]

---

## More complex formulae from the Fryxell book, Chapter 12:

![](images/fryxellformulae.png)

These are used when **sampling areas** are unequal, and they account for the difference between sampling **with replacement** and **without replacement**.

---

## Simple-SWR

- **Simple:** Equal-sized sampling frames, i.e. `\(a_i\)` all equal
- **SWR:** Sampling 'with replacement', i.e. frames *OVERLAP*, so some individuals may be counted more than once.

`$$SE(\widehat{D}) = {1 \over a_i \sqrt{k(k-1)}} \times \sqrt{\sum n_i^2 - {\left(\sum n_i \right)^2 / k}}$$`

`$$SE(\widehat{N}) = A \times SE(\widehat{D})$$`

variable | meaning | in book
:---:|:---:|:---:
`\(k\)` | number of units sampled | .green[*n*]
`\(a_i\)` | the area of a *single* unit | .green[*a*]
`\(n_i\)` | an individual sample count | .green[*y*]
`\(A\)` | total study area |

---

## Example:

.pull-left[
![](Lecture06b_CountingAnimals_Uncertainty_files/figure-html/unnamed-chunk-6-1.png)<!-- -->

.red[
**data:** counts = {2,3,1,1}

a = 100; A = 10,000
]
]

--

.pull-right[
- `\(\widehat{N} = {2+3+1+1 \over 4 \times 100} \times 10,000 = 175\)`
- `\(SE(\widehat{N}) = {10,000 \over 100 \times \sqrt{4 \times 3}} \times \\\sqrt{(4 + 9 + 1 + 1) - {(2+3+1+1)^2\over 4}}\)`
- `\(SE(\widehat{N}) = {50 \over \sqrt{3}} \times \sqrt{15 - {49 \over 4}} = 48\)`
- `\(\widehat{N} = 175; \,\, 95\% \textrm{CI} = (81, 269)\)`

.green[*Anything wrong with this confidence interval?*]
]

---

## Example: More Heterogeneity

.pull-left[
.darkred[
**data:**

- counts = {0,0,0,8}
- a = 100; A = 10,000
]

![](Lecture06b_CountingAnimals_Uncertainty_files/figure-html/unnamed-chunk-7-1.png)<!-- -->
]

--

.pull-right[
.green[
`\(\widehat{N} = {0+0+0+8 \over 4 \times 100} \times 10,000 = 200\)`
]

`\(SE(\widehat{N}) =\\ {50 \over \sqrt{3}} \times \sqrt{(0+0+0+8^2) - {(0+0+0+8)^2 \over 4}} = \\ = {50 \over \sqrt{3}} \times \sqrt{48} = {50 \over \sqrt{3}} \times 4\sqrt{3} = 200\)`

.green[
`$$95\% \textrm{CI} = (-192, 592)$$`
]

.darkred[**Enormous confidence intervals, because of enormous variability in samples!**]
]

---

## Simple - SWOR

- **SWOR:** Sampling *without* replacement, i.e. the design guarantees that no individual is counted more than once.

`$$SE(\widehat{D_{swor}}) = SE(\widehat{D_{swr}}) \times \sqrt{1-{a_s / A}}$$`

The larger the proportion sampled (*coverage*), the smaller the **sampling error.**

--

## Ratio (SWR/SWOR)

**Ratio**: unequal sample frames .blue[(e.g. both hula hoops and meter squares)].

- `\(\widehat{D} = \sum n_i / \sum a_i\)` (total count over total area, as before)
- Standard errors: more complicated ... see formulae.

---

## Take-aways

- It is **very important** to quantify uncertainty! But it can also be **hard** (and **disheartening**).

--

- Larger samples & higher coverage `\(\to\)` smaller errors `\(\to\)` narrower confidence intervals `\(\to\)` more precision.

--

- The error estimates account for **sample randomness**, but also for **heterogeneity**: the more **heterogeneous** the distribution, the larger the errors.

--

- Which is why ... .green[you take that heterogeneity into account in your **estimates**!]

---

# One more formula ... for combining estimates

.pull-left[
If you have multiple sub-count estimates (e.g. one for each of `\(r\)` sub-regions):

.darkred[
- `\(\widehat{N_1}, \widehat{N_2}, ..., \widehat{N_r},\)`
]

and each estimate has a standard error:

.darkred[
- `\(SE(\widehat{N_1}), SE(\widehat{N_2}), ..., SE(\widehat{N_r})\)`
]

Then ...
]

--

.pull-right[
... the **total** estimate will be:

.darkgreen[
`$$\widehat{N} = \sum_{i = 1}^r \widehat{N_i}$$`
]

and the standard error will be:

.darkgreen[
`$$SE(\widehat{N}) = \sqrt{\sum_{i = 1}^r SE(\widehat{N_i})^2}$$`
]
]

--

.center[Will the estimate be more precise? **You get to test this out in the field!**]
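---
class: small

## Combining estimates: a quick check in R

A minimal sketch of the combination rule above; the sub-region numbers are made up purely for illustration:

```r
# hypothetical estimates and standard errors from r = 3 sub-regions
N.hat <- c(210, 175, 80)        # sub-region abundance estimates
SE    <- c(46, 48, 28)          # their standard errors

N.total  <- sum(N.hat)          # combined estimate: 465
SE.total <- sqrt(sum(SE^2))     # combined SE: approx. 72.1

c(estimate = N.total, SE = SE.total, CV = SE.total / N.total)
```

Here the combined CV (about 0.16) is smaller than any one sub-region's CV - one reason splitting a survey into sub-counts can pay off.
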
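---
class: small

## Appendix: the Simple-SWR example in R

A minimal sketch (not part of the original lecture code) recomputing the four-frame Simple-SWR example; the object names are just illustrative:

```r
# Simple-SWR: k equal frames of area a, total study area A
n <- c(2, 3, 1, 1)               # counts in each sampled frame
k <- length(n); a <- 100; A <- 10000

D.hat <- sum(n) / (k * a)        # density estimate: 0.0175
N.hat <- D.hat * A               # abundance estimate: 175
SE.D  <- sqrt(sum(n^2) - sum(n)^2 / k) / (a * sqrt(k * (k - 1)))
SE.N  <- A * SE.D                # SE of abundance: approx. 48
N.hat + c(-1, 1) * 1.96 * SE.N   # 95% CI: approx. (81, 269)
```

Swapping in counts of c(0, 0, 0, 8) reproduces the high-heterogeneity example: a similar total count, but an SE of 200 instead of 48.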