NATIONAL EMPLOYMENT SURVEY
SAMPLE DESIGN

TECHNICAL SUBDIRECTORATE
RESEARCH AND DEVELOPMENT DEPARTMENT
SEPTEMBER 2006

TABLE OF CONTENTS

Introduction
1 Goals of the survey
2 Target population
3 Sample frame and stratification
4 Estimate levels
5 Sample and analysis units
6 Sample size
7 Average size of dwellings by sections
8 Selection of sampling units
9 Sample distribution
10 Dwellings rotation in the sample
11 Estimation method: factors of expansion
11.1 Construction of the factor of expansion
12 Total standard estimator
13 Ratio estimator separate from total
14 Variance estimators
14.1 Estimator of the variance for a total
14.2 Estimator of the variance for Unemployment Rate
15 Estimator of the standard estimator's variances (not adjusted by population)
16 Sampling error
16.1 Absolute error of an estimator at 95% confidence
16.2 Relative error of an estimator at 95% confidence
17 Confidence interval at 95%

APPENDICES
APPENDIX 1: Sample by estimate areas and relative error for unemployed (Base: Oct-Dec 2003)
APPENDIX 2: Procedure for the selection of first- and second-phase units
APPENDIX 3: Areas with difficult access (ADAs), deducted from sample
APPENDIX 4: Comparison between current estimate levels and levels in the new proposal
APPENDIX 6: Large or big cities
a) Greater Santiago area
b) Greater Valparaíso area
c) Greater Concepción area
d) Greater Temuco area
e) Greater La Serena area
APPENDIX 7: Estimators of Totals, Ratios, Rates and Variances. Procedures for calculating estimators used in the National Employment Survey
BIBLIOGRAPHY

NATIONAL STATISTICS INSTITUTE

Introduction

In general terms, the sample selection design for the 2006 National Employment Survey (ENE, its Spanish acronym) is similar to the one used in 1996 [1]: a two-stage probability selection method with geographic stratification by region and urban-rural area. The estimators associated with the design are not self-weighted and are adjusted by an exogenous population projection, computed with demographic methods by the National Statistics Institute (INE) and the Latin American and Caribbean Demographic Centre (CELADE).

In substance, the new design is identical to the one in effect. The differences lie mainly in allowing more labour market phenomena to be measured than the design in force up to 2005, as shown later in this document. Among the relevant differences, the update of the SIEH sample frame (Spanish acronym for Integrated Household Surveys System) with data from the latest Household and Population Census, carried out in 2002, stands out.

[1] National Statistics Institute (1996), NATIONAL EMPLOYMENT SURVEY METHODOLOGY.

1 Goals of the survey

The survey aims to characterize the Chilean population aged 15 years and over, with special emphasis on the Labour Force. This characterization encompasses several profiles, including gender, age, occupation group, activity branch, occupation category, and education level.

The sample is representative at the national level (urban-rural), the regional level (urban-rural), for large urban centres (cities with 40,000 or more inhabitants), and for the rest of the urban area.
The sample design takes into account the difference between circumstantial and structural phenomena in the labour market. This distinction was also built into the new master frame for the SIEH, whose permanent updating is one of its most important improvements.

2 Target population

The target population of the sample for the National Employment Survey (ENE) is made up of all inhabitants of the country living in occupied private dwellings. This definition excludes all people living in collective dwellings such as hospitals, jails, convents, quarters and others, but includes people residing in private dwellings inside such facilities, such as doorkeepers, janitors and others. People living in areas of difficult access, known as ADAs, also fall outside the geographic scope.

3 Sample frame and stratification

a) Sample frame

Given the characteristics of the area frame, the sample frame was developed from mapping and base material from the 2001 Pre-census and the 2002 Household and Population Census, made up of region, province, municipality and district maps, including the boundaries of urban and rural areas as well as the demarcation of ADAs. Every element making up the frame has a known, non-zero probability of being selected.

Structure of the strata in the sample frame: In accordance with the geographical features of the area sample frame, the strata considered under the Political-Administrative Division (DPA) are communes and, within them, urban and rural areas. The frame does not include difficult access areas (ADAs).

Exclusion areas (ADAs): Difficult access areas (ADAs) are those that, because of weather conditions, topography, or a lack of passable roads or transport (see Appendix 3), remain isolated during part or all of the year.
Sample frame division into clusters or sections: Considering the characteristics of the area frame, the construction of the new sample frame was based on national mapping material, made up of regional and municipal maps, including the demarcation of urban and rural areas and of ADAs. With this background, the national territory was divided into strata and geographic areas or clusters, called sections. Sections were formed considering population volume and number of dwellings. The size of each section varies among the different strata making up the country's provinces or regions.

Table No. 1: Section average size by stratum

Stratum                         Average size of dwellings by section
City                            100 - 200 dwellings
Rest of the urban area (RUA)    80 - 150 dwellings
Rural                           60 - 80 dwellings

For the construction of the sections, stability over time, easy identification in the field (respecting district boundaries in both urban and rural sections), homogeneity [2] and compactness [3] were also taken into account.

Coverage or scope: The sample frame for SIEH 2003 used a national geographic scope covering the entire continental territory of Chile, excluding ADAs. ADAs are determined at a stage prior to the development of the sample frame and account for 0.53% of occupied private dwellings in the national territory.

b) Stratification of the master sample frame for ENE: The master sample frame is divided into communes or municipalities but, for comparison purposes, this segmentation was aligned with the current ENE strata, adding those resulting from the division of such municipalities.
The result was a total of 158 strata by geographic condition (political-administrative division), number of dwellings and population recorded in the 2002 Housing and Population Census, according to the following definitions:

Cities or large urban centres (CD): Made up of cities or groups of adjacent cities with 40,000 or more inhabitants.

Rest of urban area (RUA): Group of dwellings with fewer than 40,000 and more than 2,000 inhabitants, or between 1,001 and 2,000 inhabitants with 50% or more of the economically active population engaged in secondary or tertiary activities.

Rural areas (R): Group of dwellings, concentrated or scattered, with 1,000 or fewer inhabitants, or between 1,001 and 2,000 inhabitants with less than 50% of the economically active population engaged in secondary or tertiary activities.

[2] Classification of the section within the stratum regarding the Unemployed, by error ρ.
[3] Referring to similarity of sections within a stratum in relation to estimates and errors.

4 Estimate levels

The disaggregation of estimate levels throughout the country is shown below:

NATIONAL: Urban areas, large cities and rest of urban areas; Rural.
REGIONAL: Urban and rural.
PROVINCES AND LARGE CITIES (40,000 inhabitants or more).

5 Sample and analysis units

First-phase units are the sections, that is, well-defined geographic clusters containing approximately 60 to 200 dwellings each, depending on the stratum. Second-phase units are the private dwellings found in each section (first-phase unit). The units of analysis are all people making up a household in the selected occupied private dwellings.

6 Sample size

The sample size was determined based on total dwellings, so as to achieve a small sampling error for the estimated total of unemployed: a relative error of about 2% at a 95% confidence level.
The formula used to define the size, in number of sections, corresponds to a simple random sample adjusted by the design effect (DEFF).

Number of sections in the sample in stratum h:

$$ n_h = \left( \frac{z_{0.95}}{e_{a_h}} \right)^2 \cdot \frac{S_h^2 \cdot M_h}{m_h} \cdot Deff_h $$

Resulting in a total size of:

$$ n = \sum_{h=1}^{158} n_h = 3{,}558 \text{ sections} $$

Where:
$n_h$ : Number of sections by stratum
$e_{a_h}$ : Absolute error of the total of Unemployed by stratum h
$S_h^2$ : Value of quasi-variance estimated by stratum
$M_h$ : Number of dwellings contained in stratum h as of the 2002 census
$Deff_h$ : Refers to the “design effect” in stratum h, computed as the quotient between the variance of a standard estimator under the design in phases and the variance of the same estimator under a simple random sample of dwellings [4].

[4] Details of this calculation are provided in Appendix 8 of this document or in INE's Methodology Department report titled “Cálculo del efecto del diseño en la muestra nacional del empleo en el trimestre Octubre-Noviembre-Diciembre 2003” (Calculating the design effect on the national employment sample for October, November and December 2003).

The absolute error was computed based on the total number of unemployed people in each stratum, data obtained and processed from the employment survey carried out in Oct-Dec 2003. This value was held fixed in order to define the resulting sample size at a later stage, and therefore to maintain the original sampling errors and guarantee accuracy starting from a 2% coefficient of variation at the national level.

For sample allocation among the strata, besides the pyramid structure of sampling errors [5], other relative factors from each stratum were taken into account, such as:
1. Unemployment rate by stratum
2. Coefficient of variation
3. Update and interviewing costs
4. Population variance
These factors were used to define and analyse margins of error for the different strata and to choose an appropriate sample size.

According to the design phases, the sample size is disaggregated as follows:
• First phase units: 3,558 sections
• Second phase units: 34,511 private dwellings

Table 2 shows the sample sizes at national and regional level by urban and rural area. It should also be noted that in some estimation areas the relative error is very high due to a low unemployment rate (for example, in the rural area of region IX). To reduce the error in these cases, with very small rates, it would be necessary to expand the sample to prohibitive levels [6].

[5] That is to say, the error grows as estimates are disaggregated down the structure: National, Urban, Rural; Regional, Urban, Rural; Communal (Municipalities) and Large Centres, and strata.
[6] In estimate levels with low unemployment rates, it is not necessary to know them exactly. In this case, the order of magnitude suggested by the estimate and its confidence interval is enough to conclude that unemployment at these levels is not a significant problem compared to other levels, where it is urgent to know whether the rate falls by one or two points under appropriate public policies.

Rating            Range                       Relative error at 95% (%)    Coefficient of variation (%)
Good (B)          Less than or equal to       10                           5.1
Fair (R)          Between                     11 - 30                      5.6 - 15.3
Acceptable (A)    Between                     31 - 50                      15.8 - 25.5
Poor (M)          Greater than or equal to    51                           26.0

7 Average size of dwellings by sections

The average number of dwellings by section in each stratum was defined by minimizing the variance of total unemployment, using a cost function in which travel expenses among units are of little importance.
Assuming a fixed number of sections by stratum, the formula is:

$$ m_{h,\text{optimum}} = \frac{S_w}{\sqrt{S_b^2 - \dfrac{S_w^2}{M_h}}} \cdot \sqrt{\frac{C_1}{C_2}} $$

Where:

$$ C = C_1 \cdot n + C_2 \cdot n \cdot m $$

$C$ : Cost function of surveying in stratum h
$C_1 \cdot n$ : Proportionate to the number of sample primary units
$C_2 \cdot n \cdot m$ : Proportionate to the total number of second-phase units

$$ S_b^2 = \frac{\sum_{i=1}^{n_h} m_{hi} \left( \bar{Y}_{hi} - \bar{Y}_h \right)^2}{n_h - 1} $$ : Variance between sections (primary units)

$$ S_w^2 = \frac{\sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} \left( y_{hij} - \bar{Y}_{hi} \right)^2}{\sum_{i=1}^{n_h} m_{hi} - n_h} $$ : Variance between dwellings within sections (secondary units)

$M_h$ : Number of housing dwellings in stratum h
$n_h$ : Number of sections in the sample within stratum h
$m_h$ : Number of dwellings in the sample within stratum h
$m_{hi}$ : Number of dwellings in stratum h, section i

As such, the number of dwellings to be included in each section by stratum is set according to the following table:

Table No. 3: Average number of dwellings to be surveyed, by section and stratum

Stratum               Section average size     Number of dwellings by section
City                  100 - 200 dwellings      8 (with 1 more as replacement)
Rest of urban area    80 - 150 dwellings       12
Rural                 60 - 80 dwellings        15

8 Selection of sampling units

The selection of first-phase units was performed in each geographic stratum with probability proportionate to section size, that is to say, according to the number of dwellings (see Appendix 2). The probability of inclusion for the i-th primary unit (section), which is proportionate to the section size, is equal to:

$$ \pi_{hi} = \frac{n_h \cdot M_{hi}}{M_h} $$

Where h represents the stratum index (CD, RUA/U, R); $M_{hi}$ is the total number of dwellings in section i of stratum h as of the 2002 Census; and $M_h$ is the total number of dwellings in stratum h as of the same census.
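As a rough illustration of the first-phase inclusion probability, the sketch below computes n_h · M_hi / M_h for one stratum and draws a systematic sample with probability proportional to size, in the spirit of the procedure described in Appendix 2. The section sizes, the function names `inclusion_probabilities` and `pps_systematic`, and the real-valued random start are assumptions made for this example; they are not taken from the survey's actual processing.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def inclusion_probabilities(sizes, n):
    """First-phase inclusion probability n * M_hi / M_h for each section."""
    total = sum(sizes)  # M_h: dwellings in the stratum
    return [n * size / total for size in sizes]


def pps_systematic(sizes, n, rng=None):
    """Pick n section indices systematically, proportional to size.

    Assumes every section is smaller than the step M_h / n; otherwise a
    section could be selected more than once.
    """
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))  # upper bounds of the intervals
    step = cumulative[-1] / n             # k = M_h / n
    start = rng.random() * step           # random start in [0, k)
    picks = []
    for k in range(n):
        point = start + k * step
        # Index of the interval containing the point (guarded for float edges).
        picks.append(min(bisect_right(cumulative, point), len(sizes) - 1))
    return picks


# Hypothetical stratum with six sections (dwelling counts are invented).
sizes = [100, 150, 200, 120, 80, 150]
probs = inclusion_probabilities(sizes, n=3)
sample = pps_systematic(sizes, n=3, rng=random.Random(2006))
print(probs)   # the probabilities add up to n
print(sample)  # indices of the selected sections
```

Because the probabilities add up to n_h, larger sections are more likely to enter the sample, which is what the later expansion factors compensate for.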
The selection of second-phase units (dwellings) was undertaken with equal probability for the dwellings in each section, using a systematic selection with probability

$$ \pi_{j/hi} = \frac{m_{hi}}{M'_{hi}} $$

where $M'_{hi}$ is the updated number of dwellings in section i of stratum h, and $m_{hi}$ is the number of dwellings to be surveyed in section i of stratum h. Everybody in each selected dwelling is surveyed, so there is no third sampling phase (see Appendix 2).

9 Sample distribution

The total sample (3,558 sections) is distributed over a quarter and split into three samples of roughly similar size, each assigned to one of the three months. The survey of each of these smaller samples therefore lasts one month, and no single monthly sample is by itself representative of the estimate levels. As a result, all dwellings surveyed in month "t" are surveyed again in month "t+3", and each dwelling in the sample is surveyed only once per quarter [7].

[7] Data from each month is accumulated and included in the entire sample.

Estimates for period (month) "t" are computed using data from months "t", "t-1", and "t-2".

10 Dwellings rotation in the sample

Rotation of sections is a procedure aimed at keeping the sample up to date and avoiding overuse of the informants. It involves updating the dwelling listing of a section after a certain period, in order to register changes and any increases or decreases that have occurred, and then making a new selection of dwellings, so that the changes are incorporated into the figures estimated from the sample.

To perform the rotation, the sample sections are divided into rotation groups (Tramos de Rotación, TR), which are defined as approximately 1/6 of total urban sections (CD and RUA) and approximately 1/12 of rural sections from each region every month.
This enables the rotation of the entire sample over an 18-month period for urban sections and over a 36-month period for rural sections, thereby keeping it constantly up to date and valid by incorporating relevant changes.

11 Estimation method: weighting factors

The basic population parameters estimated in the ENE are the total number of people in the labour force and the number of unemployed people within it. The survey's most significant parameter is therefore the unemployment rate, which is the quotient of these two totals and is computed by dividing the respective estimates.

For example, to calculate the total number of unemployed within the labour force, an estimator is used that “expands” the value of the variable for each person in the sample to the universe, according to the following formula:

$$ \hat{Y}_{Sep} = \sum_{hij} F_{hi}^{(2)} \cdot y_j $$

Where:
• $y_j$ is the number of people in dwelling j (with the characteristic of interest).
• The term $F_{hi}^{(2)}$ is referred to as the expansion factor of section i in stratum h.
• These factors are constructed so that the estimator has suitable statistical properties; they depend on the sample design, that is to say, on the sample selection method, and on an adjustment to the projected population.
• The factor of expansion can be construed as the number of people in the population represented by one person in the sample.
• To calculate other totals, such as the number of 15 to 24 year old people in the labour force, only a new “dummy” variable of interest with the required condition needs to be defined.
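The expansion logic above can be sketched in a few lines: each sampled person carries the factor of his or her section, and a total is the sum of the factors of the people who meet a condition (the “dummy” variable). This is a minimal sketch; the `Person` record, the function `expanded_total` and all figures are hypothetical, and the factors are simply given rather than derived from the design.

```python
from dataclasses import dataclass


@dataclass
class Person:
    factor: float          # F_hi^(2) of the person's section (invented values)
    age: int
    in_labour_force: bool
    unemployed: bool


def expanded_total(people, condition):
    """Sum the expansion factors of the people who satisfy the condition."""
    return sum(p.factor for p in people if condition(p))


# Hypothetical sample of six people from three sections.
sample = [
    Person(120.0, 34, True, False),
    Person(120.0, 19, True, True),
    Person(95.5, 52, False, False),
    Person(95.5, 23, True, False),
    Person(210.0, 41, True, True),
    Person(210.0, 67, False, False),
]

labour_force = expanded_total(sample, lambda p: p.in_labour_force)
unemployed = expanded_total(sample, lambda p: p.in_labour_force and p.unemployed)
# A different total only needs a different condition (dummy variable).
youth_labour_force = expanded_total(
    sample, lambda p: p.in_labour_force and 15 <= p.age <= 24
)
unemployment_rate = unemployed / labour_force  # quotient of the two estimates

print(labour_force, unemployed, youth_labour_force)
print(round(unemployment_rate, 4))
```

Summing factors person by person is equivalent to multiplying each dwelling's count by its section factor, since the factor is constant within a section.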
11.1 Construction of the factor of expansion

The factor of expansion is defined as:

$$ F_{hi}^{(2)} = \frac{M_h}{n_h \cdot M_{hi}} \cdot \frac{M'_{hi}}{m_{hi}} \cdot \frac{P_h^{SE}}{\hat{P}_h^{SE}} $$

Where:
$M_h$ : Number of dwellings in stratum "h" according to the 2002 Household and Population Census
$M_{hi}$ : Number of dwellings in section "i" according to the 2002 census
$n_h$ : Number of sections in the sample of stratum "h"
$M'_{hi}$ : Number of dwellings updated in section "i"
$m_{hi}$ : Number of dwellings in the sample of section "i", stratum "h"
$P_h^{SE}$ : Projection of the number of people of gender “S” and age range “E” according to the 2002 Household and Population Census, in stratum h
$\hat{P}_h^{SE}$ : Estimated number of people of gender “S” and age range “E” [8] in stratum h, according to a standard estimator, which is addressed in detail further down.

[8] Age ranges or groups: E1 = under 15 years old; and E2 = 15 years old or more.

To explain the statistical logic of the factor, it is convenient to view it as the product of two components:

$$ F_{hi}^{(2)} = F_{hi}^{(1)} \cdot \frac{P_h^{SE}}{\hat{P}_h^{SE}} $$

Where:

$$ F_{hi}^{(1)} = \frac{M_h}{n_h \cdot M_{hi}} \cdot \frac{M'_{hi}}{m_{hi}} $$

The factor of expansion $F_{hi}^{(1)}$, commonly called the theoretic or standard factor, depends only on the sample design and can be construed as the inverse of the probability of inclusion in the sample of person k. This probability of being selected for the sample is the product of the selection probability in the first phase (sections) and the selection probability in the second phase (dwellings):

$\pi_{hij} = \pi_{hi} \cdot \pi_{j/hi}$ : Probability of dwelling j, in section i and stratum h, of being selected or included in the sample [9]
$\pi_{hi} = \dfrac{n_h \cdot M_{hi}}{M_h}$ : Probability of the i-th section in stratum h
$\pi_{j/hi} = \dfrac{m_{hi}}{M'_{hi}}$ : Conditional probability of being selected for dwelling j, section i, in stratum h

Where:
$M_h$ : Number of dwellings in stratum "h" according to the 2002 Household and Population Census.
$M_{hi}$ : Number of dwellings in section "i" according to the 2002 census.
$n_h$ : Number of sections in the sample in stratum "h".
$M'_{hi}$ : Number of updated dwellings in section "i" according to the 2002 Household and Population Census.
$m_{hi}$ : Number of dwellings in the sample of section "i", stratum "h".

[9] It is worth noting that the probability changes according to the stratum and section, but it is constant for all people in each dwelling of a section.

12 Total standard estimator

Component $F_{hi}^{(1)}$ defines by itself an expansion factor, which gives rise to the standard estimator [10] of the total:

$$ \hat{Y}_{hi} = \sum_{j,\, i \in h} F_{hi}^{(1)} \cdot y_{hi} $$

Where $y_{hi}$ is the total number of people with the characteristic of interest.

[10] This estimator is called Horvitz-Thompson (1952). For further information about its characteristics, see Carl-Erik Sarndal, Bengt Swensson, Jan Wretman (1992), Model Assisted Survey Sampling, page 43.

This estimator is unbiased and consistent for the total of the variable of interest; however, its statistical efficiency (mean squared error) can be improved by adjusting it by the ratio $P_h^{SE} / \hat{P}_h^{SE}$, resulting in the final estimator used in ENE to compute totals.

With the definition of the standard estimator, the description of the adjustment factor $P_h^{SE} / \hat{P}_h^{SE}$ is complete, since

$$ \hat{P}_h^{SE} = \sum_{j,\, i \in h} F_{hi}^{(1)} \cdot P_{hi}^{SE} $$

is the standard estimate of the number of people in stratum h, gender S and age E, where $P_{hi}^{SE}$ is the number of people in section i of stratum h, by gender S and age E.

13 Ratio estimator separate from total

The estimator of the total

$$ \hat{y}_h = \sum_{i \in h} F_{hi}^{(2)} \cdot y_{hi} $$

can be expressed more compactly as the sum of standard estimators by stratum adjusted by the population projection in the respective stratum:

$$ \hat{y}_h = \sum_{i \in h} \frac{P_h^{SE}}{\hat{P}_h^{SE}} \cdot \hat{Y}_{hi} $$

This estimator, called the separate ratio estimator, has the following attributes:
1. It is consistent.
2. Its mean squared error is lower than that of the standard estimator.
3. It is biased, but the bias is negligible compared to the benefits of the estimation.
4. It allows reducing the non-response bias, since the estimated total of people by gender and aged 15 years and over in each stratum h matches exactly the population projection for stratum h.

The final attribute implies that when the factors $F_{hi}^{(2)}$ are added up for all people aged 15 years and over in the sample, the result is the projection for stratum h. If, instead, the factors $F_{hi}^{(1)}$ are added up for all dwellings in the sample within the study domain (one factor per dwelling), the result is an estimate of the number of dwellings within the domain.

14 Variance estimators

The variance or standard error of an estimator allows the quality of an inference to be assessed: knowing the range of an estimator's variance makes it possible to decide whether a given phenomenon can be observed or not. The following formulae are derived considering the two-phase stratified sampling design. This assumption simplifies the formulae and yields conservative estimators, that is to say, estimators that overestimate the true variances.

14.1 Estimator of the variance for a total

$$ \widehat{Var}\left( \hat{Y}_h \right) = \frac{n_h}{n_h - 1} \cdot \sum_{i \in h} \left[ F_{hi}^{(1)} \cdot y_{hi} - \hat{R}_h \cdot F_{hi}^{(1)} \cdot p_{hi} \right]^2 $$

Where:
$y_{hi} = \sum_j y_{hij}$ : Number of people, for example unemployed, in the sample in section "i" (the sum runs over dwellings j)
$p_{hi} = \sum_j p_{hij}$ : Number of people aged 15 years and over in the sample, in section "i" (the sum runs over dwellings j)
$\hat{R}_h = \hat{Y}_h / \hat{P}_h$ : Ratio between the number of people with the variable of interest and total people in the stratum, with $\hat{Y}_h = \sum_{i \in h} F_{hi}^{(1)} \cdot y_{hi}$ and $\hat{P}_h = \sum_{i \in h} F_{hi}^{(1)} \cdot p_{hi}$.
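To make the variance estimator for a total concrete, here is a minimal sketch for a single stratum. It assumes the estimator of section 14.1 reads n_h/(n_h - 1) times the sum over sections of [F(1)_hi · y_hi - R_h · F(1)_hi · p_hi] squared; the function name `variance_of_total` and the input figures are hypothetical.

```python
def variance_of_total(factors, y, p):
    """Variance estimate of a stratum total from section-level data.

    factors: standard factor F_hi^(1) of each sampled section
    y:       people with the characteristic (e.g. unemployed) per section
    p:       people aged 15 and over per section
    """
    n = len(factors)
    if n < 2:
        raise ValueError("at least two sections are needed in the stratum")
    y_hat = sum(f * yi for f, yi in zip(factors, y))  # expanded total of y
    p_hat = sum(f * pi for f, pi in zip(factors, p))  # expanded total of p
    r_hat = y_hat / p_hat                             # ratio R_h
    residuals = [f * yi - r_hat * f * pi for f, yi, pi in zip(factors, y, p)]
    return n / (n - 1) * sum(e * e for e in residuals)


# Invented figures: three sections, each with a standard factor of 10.
factors = [10.0, 10.0, 10.0]
unemployed = [1, 2, 3]
adults = [10, 10, 10]
variance = variance_of_total(factors, unemployed, adults)
print(variance)
```

When the count of interest is exactly proportional to the number of adults in every section, all residuals vanish and the estimated variance is zero, which is the sense in which the ratio adjustment removes variation explained by section size.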
14.2 Estimator of the variance for Unemployment Rate

$$ \widehat{TD}_h = \frac{\hat{Y}_h}{\hat{X}_h} = \frac{\text{Estimator of Total Unemployed}}{\text{Estimator of Total Labour Force}} $$

$$ \widehat{Var}\left( \widehat{TD} \right) = \sum_h \frac{n_h}{n_h - 1} \sum_i \left[ \left( F_{hi}^{(1)} Y_{hi} - \frac{1}{n_h} \sum_i F_{hi}^{(1)} Y_{hi} \right) - \widehat{TD} \cdot \left( F_{hi}^{(1)} X_{hi} - \frac{1}{n_h} \sum_i F_{hi}^{(1)} X_{hi} \right) \right]^2 $$

15 Estimator of the standard estimator's variances (not adjusted by population)

$$ \widehat{Var}\left( \hat{Y}_{Standard} \right) = \sum_h \frac{n_h}{n_h - 1} \cdot \sum_i \left( F_{hi}^{(1)} Y_{hi} - \frac{1}{n_h} \sum_i F_{hi}^{(1)} \cdot Y_{hi} \right)^2 $$

Where:
$Y_{hi} = \sum_j y_{hij}$ : Total of, for example, unemployed people in the sample in section "i" (the sum runs over dwellings j).

16 Sampling error

The sampling errors of an estimator $\hat{\theta}$ of any parameter $\theta$, for instance a total or a rate, are the Absolute Error and the Relative Error, as defined below.

16.1 Absolute error of an estimator at 95% confidence

$$ e_a(0.95) = \text{Absolute Error} = 1.96 \cdot \sqrt{\hat{V}(\hat{\theta})} $$

16.2 Relative Error of an estimator at 95% confidence

$$ \text{Relative Error} = \frac{\text{Absolute Error}}{\hat{\theta}} = \frac{e_a(0.95)}{\hat{\theta}} $$

The relative error at a 68% confidence level is known as the “Coefficient of Variation” of an estimator. It is the quotient between the estimate's standard deviation (the square root of the variance) and the estimate's value:

$$ \widehat{CV}(\hat{\theta}) = \frac{\sqrt{\hat{V}(\hat{\theta})}}{\hat{\theta}} $$

17 Confidence interval at 95%

Once the variance of an estimator $\hat{\theta}$ has been calculated, it is possible to obtain the confidence interval for a parameter $\theta$, such as the total of a variable, with 95% confidence:

$$ \left( \hat{\theta} - e_a(0.95) \; ; \; \hat{\theta} + e_a(0.95) \right) $$

APPENDIX 1: Sample by estimate areas and relative error for the unemployed (Base: Oct-Dec 2003).

No. of sections FRAME 2002 32,964 No.
of dwellings FRAME 2002 4,000,403 URBAN 26,279 3,476,817 LARGE CENTRES 18,759 NAME, LEVEL AND STRATUM NATIONAL Unemployed Unemployment rate Sections sample Dwellings Coefficient Absolute Relative sample of variation error error 461,072 0.08 3,551 34,455 1.8 16,399 3.6 432,988 0.09 3,018 26,460 1.9 16,120 3.7 2,808,345 367,021 0.09 2,439 19,512 2.1 15,256 4.2 RURAL URBAN AREA (RUA) RURAL 7,520 668,472 65,967 0.07 579 6,948 4.0 5,209 7.9 6,685 523,586 28,084 0.04 533 7,995 5.5 3,011 10.7 REGION I 10.8 1,132 108,142 19,444 0.09 158 1,398 5.5 2,093 ARICA CITY 446 45,409 8,369 0.11 65 520 8.4 1,376 16.4 IQUIQUE CITY 405 41,573 5,904 0.08 60 480 9.8 1,130 19.1 ALTO HOSPICIO CITY 117 12,388 4,815 0.09 10 80 11.2 1,061 22.0 16.6 REGION II 1,012 111,756 13,634 0.08 144 1,317 8.5 2,269 CALAMA CITY 274 28,063 2,351 0.05 60 480 16.7 768 32.7 ANTOFAGASTA CITY 538 68,579 9,806 0.10 48 384 11.0 2,114 21.6 REGION III 778 62,427 10,032 0.09 145 1,371 5.8 1,138 11.3 COPIAPO CITY 338 28,959 3,276 0.08 60 480 10.7 687 21.0 VALLENAR CITY 133 11,481 3,075 0.13 42 336 10.4 627 20.4 REGION IV 1,701 164,615 15,208 0.07 211 2,185 8.1 2,408 15.8 URBAN 1,219 126,752 14,401 0.09 164 1,480 8.4 2,365 16.4 COQUIMBO CITY 314 39,951 5,533 0.10 38 304 15.0 1,625 29.4 LA SERENA CITY 302 39,249 5,575 0.11 37 296 13.7 1,494 26.8 OVALLE CITY 217 17,742 1,362 0.08 47 376 17.4 464 34.0 ELQUI PROVINCE 852 98,380 11,563 0.09 104 1,011 9.8 2,223 19.2 LIMARI PROVINCE 535 43,527 1,871 0.04 72 724 16.2 595 31.8 CHOAPA PROVINCE 314 22,708 1,773 0.06 35 450 20.4 710 40.0 RURAL 482 37,863 807 0.02 47 705 28.7 454 56.2 REGION V 4,065 449,301 63,176 0.11 400 3,822 4.8 5,944 9.4 URBAN 3,590 415,291 61,076 0.11 358 3,192 4.9 5,897 9.7 RURAL 475 34,010 2,100 0.04 42 630 18.0 742 35.3 PETORCA PROVINCE LOS ANDES PROVINCE LOS ANDES CITY SAN FELIPE PROVINCE QUILLOTA PROVINCE VALPARAISO PROVINCE VALPARAISO CITY 261 18,867 3,147 0.10 31 402 12.1 747 23.8 313 24,831 1,590 0.05 43 424 18.7 583 36.6 185 15,246 1,058 0.05 29 232 
22.5 467 44.1 460 36,312 3,611 0.07 40 504 13.3 943 26.1 536 47,233 2,962 0.06 46 482 18.3 1,063 35.9 1,795 258,227 43,676 0.13 160 1,325 6.4 5,490 12.6 488 76,355 15,884 0.15 54 432 9.1 2,841 17.9 VIÑA DEL MAR CITY 663 102,724 12,973 0.19 66 528 11.4 2,899 22.3 QUILPUE CITY 248 37,495 7,497 0.16 22 176 14.8 2,172 29.0 VILLA ALEMANA CITY SAN ANTONIO PROVINCE 186 28,965 5,578 0.17 16 128 21.9 2,389 42.8 497 39,891 6,054 0.12 51 453 10.0 1,184 19.6 _____________________________________________________________________________________ 16 NATIONAL STATISTICS INSTITUTE GRAN VALPARAISO No. of sections FRAME 2002 1,531 No. of dwellings FRAME 2002 236,936 SAN ANTONIO CITY 285 22,620 QUILL-CAL-CR CITIES 322 33,253 NAME, LEVEL AND STRATUM Unemployed Unemployment rate Sections sample Dwellings sample Coefficient Absolute Relative of variation error error 40,850 0.13 151 1,208 6.4 5,133 12.6 4,627 0.14 42 336 9.6 866 18.7 2,289 0.18 47 376 18.3 821 35.9 REGION VI 2,323 207,603 7,440 0.03 218 2,372 11.3 1,654 22.2 URBAN 1,542 148,272 5,414 0.04 158 1,472 14.2 1,507 27.8 RURAL 781 59,331 2,026 0.02 60 900 17.2 681 33.6 RANCAGUA CITY SAN FERNANDO CITY CACHAPOAL PROVINCE COLCHAGUA PROVINCE CARDENAL CARO PROVINCE 458 57,026 1,619 0.05 65 520 21.2 673 41.6 159 13,592 852 0.05 41 328 18.3 306 35.9 1,496 145,051 5,617 0.04 109 1,120 14.2 1,567 27.9 618 51,386 1,591 0.02 78 832 16.3 507 31.9 209 11,166 233 0.02 31 420 32.3 147 63.2 REGION VII 2,815 248,150 33,300 0.09 278 3,025 5.4 3,510 10.5 URBAN 1,813 167,355 26,114 0.11 195 1,780 6.2 3,157 12.1 RURAL 1,002 80,795 7,187 0.06 83 1,245 10.9 1,532 21.3 CURICO CITY 311 25,472 5,586 0.14 40 320 10.0 1,091 19.5 TALCA CITY 426 53,504 8,386 0.11 57 456 10.2 1,678 20.0 LINARES CITY 209 17,621 1,762 0.08 43 344 15.4 531 30.2 TALCA PROVINCE LINARES PROVINCE CAUQUENES PROVINCE CURICO PROVINCE 958 97,024 13,028 0.10 85 849 9.3 2,380 18.3 829 68,812 7,207 0.08 87 959 12.2 1,719 23.9 232 16,498 762 0.04 35 468 22.1 329 43.2 796 65,816 12,304 0.12 
71 749 7.9 1,894 15.4 REGION VIII 4,573 483,336 50,945 0.08 496 4,820 4.5 4,465 8.8 URBAN 3,452 397,465 44,843 0.08 424 3,740 4.8 4,196 9.4 RURAL 1,121 85,871 6,102 0.05 72 1,080 12.8 1,525 25.0 CHILLAN CITY 357 44,017 4,925 0.17 62 496 13.2 1,271 25.8 LOTA CITY 107 12,539 2,276 0.13 44 352 15.1 672 29.5 CORONEL CITY 191 23,550 3,347 0.12 44 352 14.0 919 27.5 LOS ANGELES CITY 2,402 311,251 37,716 0.28 305 2,672 5.6 4,151 11.0 ÑUBLE PROVINCE CONCEPCION PROVINCE CONCEPCION CITY 1,221 112,371 8,270 0.06 98 1,071 11.3 1,833 22.2 1,797 237,308 30,097 0.09 247 2,092 6.1 3,616 12.0 TALCAHUANO CITY ARAUCO PROVINCE BIO-BIO PROVINCE GREATER CONCEPCION AREA 655 100,237 11,184 0.24 66 528 11.8 2,576 23.0 409 61,730 8,945 0.20 44 352 11.2 1,964 22.0 485 37,365 4,014 0.07 45 570 11.0 862 21.5 1,349 130,499 12,391 0.07 140 1,359 8.4 2,050 16.5 1,139 173,380 22,196 0.09 119 952 7.6 3,325 15.0 REGION IX 2,543 234,996 14,406 0.05 212 2,252 7.7 2,186 15.2 URBAN 1,609 159,098 13,202 0.07 160 1,472 7.7 1,994 15.1 RURAL 934 75,898 1,205 0.01 52 780 37.9 894 74.2 ANGOL CITY 141 11,703 1,089 0.08 40 320 17.7 378 34.8 TEMUCO CITY MALLECO PROVINCE CAUTIN PROVINCE 558 70,462 6,887 0.15 72 576 11.5 1,554 22.6 1,008 104,609 8,681 0.05 100 1,018 9.7 1,644 18.9 1,460 121,075 4,815 0.05 99 1,130 14.4 1,359 28.2 _____________________________________________________________________________________ 17 NATIONAL STATISTICS INSTITUTE No. of sections FRAME 2002 3,084 No. 
of dwellings FRAME 2002 276,360 21,822 0.06 295 3,196 URBAN 2,038 191,644 17,618 0.07 211 1,936 RURAL 1,046 84,716 4,204 0.03 84 VALDIVIA CITY 316 33,400 2,973 0.07 OSORNO CITY 359 36,638 2,873 0.07 NAME, LEVEL AND STRATUM REGION X Unemploy ed Unemploym ent rate Sections sample Dwellings Coefficient sample of variation Absolute error Relative error 6.0 2,573 11.8 6.5 2,228 12.6 1,260 15.6 1,288 30.6 54 432 14.3 836 28.1 48 384 17.6 993 34.6 PUERTO MONTT CITY 369 38,892 3,194 0.07 47 376 15.2 952 29.8 VALDIVIA PROVINCE 1,064 94,405 7,321 0.06 99 1,044 11.1 1,596 21.8 OSORNO PROVINCE LLANQUIHUE PROVINCE CHILOE-PALENA PROVINCE REGION XI 667 61,953 3,978 0.06 72 723 14.5 1,134 28.5 866 79,676 4,379 0.04 83 874 12.8 1,099 25.1 487 40,326 6,144 0.09 41 555 10.4 1,258 20.5 384 24,886 1,926 0.05 106 954 12.3 465 24.1 COIHAIQUE CITY 168 12,473 1,221 0.08 52 416 17.2 411 33.7 PUERTO AISEN CITY 68 4,338 214 0.03 35 280 33.1 139 64.8 REGION XII 546 43,190 4,289 0.07 94 860 9.9 828 19.3 PUNTA ARENAS CITY 408 34,710 3,628 0.07 73 584 11.2 798 22.0 8,008 1,585,641 205,449 0.09 794 6,883 3.2 13,027 6.3 7,618 METROPOLITAN REGION URBAN 1,549,965 203,479 0.09 764 6,356 3.3 13,016 6.4 RURAL GREATER SANTIAGO AREA PUENTE ALTO CITY 432 44,303 2,950 0.04 41 615 13.2 761 25.8 6,993 1,425,287 185,820 0.09 630 5,040 3.5 12,726 6.8 629 127,753 15,283 0.10 63 504 12.0 3,591 23.5 SAN BERNARDO CITY 292 60,770 11,549 0.10 56 448 8.9 2,022 17.5 MELIPILLA CITY 71 14,640 1,560 0.06 38 304 24.0 735 47.1 COLINA CITY CHACABUCO PROVINCE CORDILLERA PROVINCE MAIPO PROVINCE 72 14,717 2,908 0.12 24 192 14.4 818 28.1 200 32,910 5,572 0.05 36 354 12.5 1,364 24.5 692 135,481 16,852 0.10 72 627 10.9 3,599 21.4 515 95,990 15,933 0.09 77 730 8.4 2,628 16.5 265 37,754 2,908 0.05 54 529 20.5 1,169 40.2 306 55,369 6,178 0.07 55 643 11.5 1,393 22.5 MELIPILLA PROVINCE TALAGANTE PROVINCE _____________________________________________________________________________________ 18 NATIONAL STATISTICS INSTITUTE APPENDIX 2: 
Procedure for the selection of first- and second-phase units

First-phase units (sections) are selected with probability proportional to size, where size is the number of dwellings, following the systematic procedure below. For the N sections in the stratum, cumulative intervals are constructed as follows:

Section   No. of dwellings   Interval
1         M1                 1 to M1
2         M2                 M1 + 1 to M1 + M2
3         M3                 M1 + M2 + 1 to M1 + M2 + M3
...       ...                ...
N         MN                 M1 + ... + MN-1 + 1 to M1 + ... + MN = M

Next, a random number “A” is generated between 1 and k = M/n, where n is the number of sections to be selected in the stratum. The selected sections are those whose intervals contain the values A, A + k, A + 2k, ... , A + (n - 1)k. This procedure does not allow repetitions, and it can be demonstrated that the selection probability of a unit with “Mi” dwellings is “n Mi/M”.

APPENDIX 3: Areas with difficult access (ADAs), excluded from the sample

Region I Location Valle de Lluta (P) II Entity Situation Difficult access; bad, sandy road, dangerous dunes and brook zone. Sora is a hamlet with difficult Sora access, where the road crosses a fast-flowing river, with no bridge. The entire municipality of Ollagüe. The entire district No.19 in Copiapó is considered Campo Marte ADA, since it is a very remote, Andean area. III Ciénaga Redonda IV Caldera y Damas V Alta Montaña VI Chancón (P) Anita VII El Baúl El Baúl VIII Puerto Sur IX Chilpaco (P) Ex Colonia Penal Chilpaco (P) X Hueyusca (P) El Mirador XI Las Bandurrias (P) XII XIII Entre Vientos El Ingenio (P) La Sombría Los Libertadores Las Bandurrias Monte Bello El Ingenio Difficult access Only in winter-time. Access only possible from January to March. Classification as ADA was ratified with CD. The sector remains classified as ADA, since access is very difficult; a 4WD vehicle is required, on top of good weather condition. Very far away from urban centres. Area of high cost all DC 10 corresponds to Isla Santa María ADA. Chilpaco (P) is considered entity.
DC7 Hueyusca hamlet, El Mirador (Cs) entity is an area with difficult access. EXCLUSION AREA AE (MILITARY ZONE) Difficult access

In the following regions, entire communes/municipalities have been excluded from the sample:
• Region II (Ollagüe commune is excluded)
• Region V (Isla de Pascua and Juan Fernández are excluded)
• Region X (Cochamó, Futaleufú, Hualaihué and Palena are excluded)
• Region XI (Guaitecas, O'Higgins and Tortel are excluded)
• Region XII (Cabo de Hornos and Antarctica are excluded)

APPENDIX 4: Comparison between current estimate levels and levels in the new proposal

The estimate levels used in the National Employment Survey 1996 are compared with the estimate levels proposed for the new ENE, based on the 2002 Census. Emphasis is placed on the additional levels in the new proposal for ENE 2006 and on the levels of the current ENE 96 sample that would no longer be included.
The following table shows the levels in both samples (current and the new proposal): CURRENT LEVELS NEW LEVELS NATIONAL TOTAL URBAN LARGE URBAN CENTRES NATIONAL RURAL-URBAN AREA (RUA) NATIONAL URBAN LARGE CENTRES RUA RURAL RURAL REGION I, TARAPACA ARICA CITY IQUIQUE CITY REGION I ARICA CITY IQUIQUE CITY ALTO HOSPICIO CITY REGION II, ANTOFAGASTA CALAMA CITY CHUQUICAMATA CITY ANTOFAGASTA CITY REGION II CALAMA CITY REGION III, ATACAMA COPIAPO CITY REGION III COPIAPO CITY VALLENAR CITY VALLENAR CITY REGION IV, COQUIMBO URBAN RURAL COQUIMBO CITY LA SERENA CITY OVALLE CITY ELQUI PROVINCE LIMARI PROVINCE CHOAPA PROVINCE ANTOFAGASTA CITY REGION IV URBAN RURAL COQUIMBO CITY LA SERENA CITY OVALLE CITY ELQUI PROVINCE LIMARI PROVINCE CHOAPA PROVINCE _____________________________________________________________________________________ 21 NATIONAL STATISTICS INSTITUTE CURRENT LEVELS NEW LEVELS REGION V, VALPARAISO URBAN RURAL PETORCA PROVINCE LOS ANDES PROVINCE V REGION URBAN RURAL PETORCA PROVINCE LOS ANDES PROVINCE LOS ANDES CITY SAN FELIPE PROVINCE QUILLOTA PROVINCE VALPARAISO PROVINCE GREATER VALPARAISO AREA VALLENAR CITY VIÑA DEL MAR CITY QUILPUE CITY VILLA ALEMANA CITY SAN ANTONIO PROVINCE SAN ANTONIO CITY QUILL-CAL-CR CITIES SAN FELIPE PROVINCE QUILLOTA PROVINCE VALPARAISO PROVINCE GREATER VALPARAISO AREA VALPARAÍSO CITY VIÑA DEL MAR CITY SAN ANTONIO PROVINCE SAN ANTONIO CITY GROUP: QUILLOTA CALERA - LA CRUZ REGION VI, DEL LIBERTADOR GENERAL BERNARDO O'HIGGINS URBAN RURAL RANCAGUA CITY SAN FERNANDO CITY CACHAPOAL PROVINCE COLCHAGUA PROVINCE CARDENAL CARO PROVINCE REGION VII, DEL MAULE URBAN RURAL CURICO PROVINCE CURICO CITY TALCA PROVINCE TALCA CITY LINARES PROVINCE LINARES CITY CAUQUENES PROVINCE REGION VI URBAN RURAL RANCAGUA CITY SAN FERNANDO CITY CACHAPOAL PROVINCE COLCHAGUA PROVINCE CARDENAL CARO PROVINCE REGION VII URBAN RURAL CURICO PROVINCE CURICO CITY TALCA PROVINCE TALCA CITY LINARES PROVINCE LINARES CITY CAUQUENES PROVINCE 
CURRENT LEVELS                                          NEW LEVELS

REGION VIII, BIOBIO                                     REGION VIII
URBAN                                                   URBAN
RURAL                                                   RURAL
ÑUBLE PROVINCE                                          ÑUBLE PROVINCE
CHILLAN CITY                                            CHILLAN CITY
CONCEPCION PROVINCE                                     CONCEPCION PROVINCE
GREATER CONCEPCION AREA                                 GREATER CONCEPCION AREA
CONCEPCION CITY                                         CONCEPCION CITY
TALCAHUANO CITY                                         TALCAHUANO CITY
LOTA CITY                                               LOTA CITY
CORONEL CITY                                            CORONEL CITY
ARAUCO PROVINCE                                         ARAUCO PROVINCE
BIO-BIO PROVINCE                                        BIO-BIO PROVINCE
LOS ANGELES CITY                                        LOS ANGELES CITY

REGION IX, DE LA ARAUCANIA                              REGION IX
URBAN                                                   URBAN
RURAL                                                   RURAL
MALLECO PROVINCE                                        MALLECO PROVINCE
ANGOL CITY                                              ANGOL CITY
TEMUCO CITY                                             TEMUCO CITY
CAUTIN PROVINCE                                         CAUTIN PROVINCE

REGION X, LOS LAGOS                                     REGION X
URBAN                                                   URBAN
RURAL                                                   RURAL
VALDIVIA PROVINCE                                       VALDIVIA PROVINCE
VALDIVIA CITY                                           VALDIVIA CITY
OSORNO PROVINCE                                         OSORNO PROVINCE
OSORNO CITY                                             OSORNO CITY
LLANQUIHUE PROVINCE                                     LLANQUIHUE PROVINCE
PUERTO MONTT CITY                                       PUERTO MONTT CITY
CHILOE-PALENA PROVINCE                                  CHILOE-PALENA PROVINCE

REGION XI, AISEN DEL GENERAL CARLOS IBANEZ DEL CAMPO    REGION XI
COIHAIQUE CITY                                          COIHAIQUE CITY
PUERTO AISEN CITY                                       PUERTO AISEN CITY

REGION XII, MAGALLANES Y DE LA ANTARTICA CHILENA        REGION XII
PUNTA ARENAS CITY                                       PUNTA ARENAS CITY

METROPOLITAN REGION, SANTIAGO                           MR
URBAN                                                   URBAN
RURAL                                                   RURAL
SANTIAGO PROVINCE                                       SANTIAGO PROVINCE
GREATER SANTIAGO AREA                                   GREATER SANTIAGO AREA
PUENTE ALTO CITY                                        PUENTE ALTO CITY
SAN BERNARDO CITY                                       SAN BERNARDO CITY
MELIPILLA CITY                                          MELIPILLA CITY
                                                        COLINA CITY
CHACABUCO PROVINCE                                      CHACABUCO PROVINCE
CORDILLERA PROVINCE                                     CORDILLERA PROVINCE
MAIPO PROVINCE                                          MAIPO PROVINCE
MELIPILLA PROVINCE                                      MELIPILLA PROVINCE
TALAGANTE PROVINCE                                      TALAGANTE PROVINCE

APPENDIX 6: Large/big cities

As detailed in the classification of Urban Centres, the Metropolis is a country's largest urban representation; it concentrates more than
1,000,000 inhabitants and accounts for a high percentage of the total population. The Metropolis is made up of the urban area of a set of communes or municipalities that have come together as a result of “conurbation” processes.

a. Greater Santiago area

Following the scheme described above, the Greater Santiago area went from the level of Large Urban Area, as defined in the 1992 Census, to the level of Metropolis in 2002, owing to its large population; this level (“Gran Santiago” in Spanish) is subject to different treatment.

Santiago Metropolis (2002)                                   Gran Santiago (1992)
32 communes of the Santiago province                         32 communes of the Santiago province
Puente Alto Municipality                                     Puente Alto Municipality
San Bernardo Municipality                                    San Bernardo Municipality
Padre Hurtado Municipality                                   Padre Hurtado Municipality
Pirque city (Pirque Municipality) *
La Obra-Las Vertientes (San José de Maipo Municipality) *

* Both the city of Pirque (4,855 inhabitants) and the city of La Obra-Las Vertientes (2,477 inhabitants) belong to the Cordillera Province RUA stratum in the new ENE sample.

b. Greater Valparaíso area

The Greater Valparaíso area is made up of the cities of Valparaíso, Quilpue, Villa Alemana and Viña del Mar.

c. Greater Concepción area

The Greater Concepción area comprises the cities of Concepción, Chiguayante, Penco, San Pedro de la Paz and Talcahuano.

d. Greater Temuco area

The Greater Temuco area comprises the cities of Temuco and Padre Las Casas.

e. Greater La Serena area

The Greater La Serena area (“Gran Serena”) comprises the cities of Coquimbo and La Serena.

The following estimate levels are to be taken into account for the new 2006 National Employment Survey:
- Alto Hospicio City
- Los Andes City
- Quilpue City
- Villa Alemana City
- Colina City

Chuquicamata City was an estimate level in ENE 96 and is not included in the new proposal for ENE 2006.
APPENDIX 7: Estimators of totals, ratios, rates and variances

Procedures for calculating estimators used in the National Employment Survey

The sample database must contain the survey data and the parameters used to define the expansion factor, as well as the formulae required for calculating averages, rates and ratios. Because the employment survey data in this database are to be projected to the population, the procedures must follow the sample design. For this purpose, the technical algorithms used to expand the selected sample to the universe must be included, and these are fed into the computing programs.

Sample factors of expansion

The factor of expansion expresses how many units of the universe a selected unit represents. The factors used to expand the National Employment Survey according to the design are expressed as follows:

Theoretical expansion factor for dwelling j of section i, within stratum h

$$FE^{(1)}_{hi} = \frac{M_h}{n_h \cdot M_{hi}} \cdot \frac{M'_{hi}}{m_{hi}}$$

where:
$M_h$ = Number of dwellings in stratum h according to the 2002 Household and Population Census
$M_{hi}$ = Number of dwellings in section i of stratum h according to the 2002 Census
$n_h$ = Number of sections in the stratum h sample
$M'_{hi}$ = Updated number of dwellings in section i of stratum h
$m_{hi}$ = Number of dwellings in the sample, section i and stratum h

Adjusted expansion factor for dwelling j of section i, within stratum h

$$FE^{(2)}_{hi} = \frac{M_h}{n_h \cdot M_{hi}} \cdot \frac{M'_{hi}}{m_{hi}} \cdot \frac{P_h^{SE}}{\hat{P}_h^{SE}}$$

where:
$P_h^{SE}$ = Forecasted number of people of gender “S” and age group “E” (15 years old or more, and below 15 years old) according to the 2002 Household and Population Census, in stratum h
$\hat{P}_h^{SE}$ = Estimated number of people of gender “S” and age group “E” in stratum h

$$\hat{P}_h^{SE} = \sum_{i} \sum_{j} FE^{(1)}_{hi} \cdot P_{hij}^{SE}$$

that is, the standard estimate of people in stratum h, of gender “S” and age range “E” (15 years old or more, and below 15 years old).
$P_{hij}^{SE}$ = Number of people in dwelling j, section i and stratum h, of gender “S” and age range “E” (15 years old or more, and below 15 years old) in the labour force.

Estimators

The estimators associated with this design are not self-weighted, because a fixed number of dwellings is selected per section. Estimates are computed from the entire sample and are based on mobile quarters, which comprise the current month and the two previous months. The exogenous population projections used in the calculations refer to the central month.

The estimate of a total for a given variable is obtained by multiplying the variable's value for each person by that person's factor of expansion and then summing over all the people in the sample.

The estimated population by stratum, gender and age group is obtained by weighting each person by the theoretical expansion factor and then summing over all the people in the sample within the stratum, by gender and age range.

Total estimator of people with condition c

$$\hat{P}_c = \sum_{h} \sum_{i} \sum_{j} \left[ FE^{(1)}_{hi} \cdot P_{hijc} \right]$$

$P_{hijc}$ = Total people in dwelling j, section i of stratum h with condition c

Estimated rate (average) of condition b, by person with condition c

$$\hat{T}_{bc} = \frac{\hat{P}_b}{\hat{P}_c}$$

$\hat{P}_b$ = Total estimate of people with condition b

Total estimated variance

$$\widehat{Var}(\hat{P}_c) = \frac{n_h}{n_h - 1} \cdot \sum_{i \in h} \left[ F_i^{(1)} \cdot p_{ic} - \hat{R}_c \cdot F_i^{(1)} \cdot p_{ic} \right]^2$$

Sampling error

The sampling errors defined below are the coefficient of variation, the absolute error and the relative error. The coefficient of variation of an estimate is the quotient between the standard deviation of the estimate (the square root of its variance) and the value of the estimate.

Coefficient of variation:

$$\widehat{CV}(\hat{P}_c) = \frac{\sqrt{\hat{V}(\hat{P}_c)}}{\hat{P}_c}$$

Absolute error:

$$\text{AbsoluteError} = 1.96 \cdot \sqrt{\hat{V}(\hat{P}_c)}$$
Relative error:

$$\text{RelativeError} = \frac{\text{AbsoluteError}}{\hat{P}_c}$$

Statistical confidence interval for an estimator between 2 periods (mobile quarters)

Once the variance of the estimator is known, the confidence interval for the total number of people in the population with condition “c” can be obtained at a 95% confidence level.

Confidence interval at 95%

$$\left( \hat{P}_c - 1.96 \cdot \sqrt{\hat{V}(\hat{P}_c)} \; ; \; \hat{P}_c + 1.96 \cdot \sqrt{\hat{V}(\hat{P}_c)} \right)$$

BIBLIOGRAPHY

Des Raj (1968), “Sampling Theory”.
Cochran W. G. (1998), “Sampling Techniques”.
Carl-Erik Sarndal, Bengt Swensson, Jan Wretman (1992), “Model Assisted Survey Sampling”.
C. J. Skinner, D. Holt and T. M. Smith (1989), “Analysis of Complex Surveys”.
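As an illustration of the Appendix 7 formulas, the following is a minimal Python sketch of the two expansion factors and of the sampling-error measures. It is not the Institute's production code: the function names and the dwelling counts in the first example are hypothetical. The second example uses the Region VIII row of Appendix 1 (50,945 unemployed, absolute error 4,465); since the variance itself is not published there, it is backed out from the absolute error, which is an assumption made only for illustration.

```python
import math

Z_95 = 1.96  # multiplier used in the document for 95% confidence


def theoretical_factor(M_h, n_h, M_hi, M_hi_updated, m_hi):
    """FE(1) = (M_h / (n_h * M_hi)) * (M'_hi / m_hi)."""
    return (M_h / (n_h * M_hi)) * (M_hi_updated / m_hi)


def adjusted_factor(fe1, projected_pop, estimated_pop):
    """FE(2) = FE(1) * P_h^SE / P-hat_h^SE (projection over standard estimate)."""
    return fe1 * projected_pop / estimated_pop


def sampling_errors(estimate, variance):
    """Coefficient of variation, absolute error, relative error and 95% interval."""
    std_dev = math.sqrt(variance)
    absolute_error = Z_95 * std_dev
    return {
        "cv": std_dev / estimate,
        "absolute_error": absolute_error,
        "relative_error": absolute_error / estimate,
        "interval": (estimate - absolute_error, estimate + absolute_error),
    }


# Hypothetical section: 1,000 census dwellings in the stratum, 10 sampled
# sections, 100 census dwellings in this section, 120 after updating,
# 8 dwellings interviewed.
fe1 = theoretical_factor(M_h=1000, n_h=10, M_hi=100, M_hi_updated=120, m_hi=8)

# Hypothetical projected and estimated populations for one gender/age cell.
fe2 = adjusted_factor(fe1, projected_pop=5000, estimated_pop=4800)

# Region VIII row of Appendix 1; the variance is derived from the published
# absolute error (assumption for illustration only).
estimate = 50945
variance = (4465 / Z_95) ** 2
errors = sampling_errors(estimate, variance)

print(f"FE(1) = {fe1}, FE(2) = {fe2}")
print(f"CV = {100 * errors['cv']:.1f}%")
print(f"Relative error = {100 * errors['relative_error']:.1f}%")
print(f"95% interval = {errors['interval']}")
```

With these inputs the rounded coefficient of variation (4.5%) and relative error (8.8%) agree with the values printed for Region VIII in Appendix 1, which suggests the published columns follow the formulas above.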