Pre-Tax Wage and Salary Income Inequalities in Largest Metropolitan Areas in the United States

The distribution of pre-tax wages and salaries for employed individuals between the ages of 18 and 65 in the ten largest metropolitan areas of the USA is studied in this paper using the American Community Survey data from 2019. The included metropolitan areas are Atlanta-Sandy Springs-Roswell, Chicago-Naperville-Elgin, Dallas-Fort Worth-Arlington, Houston-The Woodlands-Sugar Land, Los Angeles-Long Beach-Anaheim, Miami-Fort Lauderdale-West Palm Beach, New York-Newark-Jersey City, Philadelphia-Camden-Wilmington, San Francisco-Oakland-Hayward, and Washington-Arlington-Alexandria. These ten metropolitan areas employed over 39 million individuals, representing well over a quarter of the total employed labour force in the USA. Mean, median, standard error of the mean, 25th percentile, 75th percentile, and the Gini coefficient of pre-tax wages and salaries are presented for each metropolitan area. The metros differ significantly in terms of average pre-tax wages and salaries. They also differ significantly in terms of the spread in the distribution of pre-tax wages and salaries, measured both by the inter-quartile range (the difference between the 75th and 25th percentiles) and by the Gini coefficient. San Francisco-Oakland-Hayward is found to have both the highest average pre-tax wages and salaries and the widest inequality as measured by the Gini coefficient. The smallest Gini coefficient is observed in the Washington-Arlington-Alexandria metropolitan area. Inequality measured in terms of the Gini coefficient is nearly 15% higher in San Francisco-Oakland-Hayward as compared to Washington-Arlington-Alexandria. The average pre-tax wages and salaries are about 83% higher in San Francisco-Oakland-Hayward than in Miami-Fort Lauderdale-West Palm Beach, which has the lowest average among the ten metros studied. While aggregate nationwide inequalities attract intense attention, these results point to significant and wide-ranging variations between different regions (metropolitan cities).
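By focusing on pre-tax wages and salaries, this study captures the inequalities that are most closely related to labour market conditions, setting aside other sources of income like capital gains, inheritance, and government transfers. Furthermore, by pulling together multiple indicators of the distribution of pre-tax wages and salaries (like mean, median, 25th and 75th percentiles, inter-quartile range, and Gini coefficient), this paper provides a more granular view of labour income in major American cities, enabling us to compare them on several income distribution-related parameters.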


Introduction
Income inequality is one of the most widely discussed and pervasively used economic concepts in policy and business circles. The rise of income inequality in the second half of the 20th century and the first two decades of the 21st century has attracted wide-ranging discussions in both press and academia (Arrow, 1998; Heckman, 1998; Bertrand and Mullainathan, 2004; Autor, Katz et al., 2008; Glaeser and Resseger, 2010; Acemoglu and Autor, 2011; David and Dorn, 2013). Widening income distribution in recent decades has also been blamed for increasing political unrest and demand for stronger re-distributive economic policies in many parts of the world. Many studies focus on the total income derived from various sources like wages and salaries, asset income, inheritance, government transfers, etc. Many of these sources, like asset income, inheritance, and government transfers, may have very weak or no relationship with one's abilities and human capital, or may be only weakly related to the local labour market conditions. Therefore, to keep the noise as low as possible, this paper focuses only on pre-tax wages and salaries, which are most likely to correlate with labour market conditions and skill demand. The skill premium has been identified as one of the key drivers of inequality in the USA. The reasoning is quite straightforward. In a labour force with unequal skill levels, more skilled workers are likely to earn higher compensation, as their productivity generally exceeds that of their lower-skilled counterparts. Thus, a technological shift that deepens the employment of higher-skilled workers may end up increasing inequalities in the economy. This paper focuses on the ten largest metropolitan areas in the USA. These are not only the largest employment agglomerations in the country, but they are also known for their special roles in the national economy. For example, the San Francisco area is home to Silicon Valley and is a major destination of tech-related employment in the world.
New York is especially known for its financial industry, and Chicago may have one of the world's highest concentrations of commodity trading. Miami is a major tourist destination, and Washington, D.C. is the central hub of government and international economic policymaking, being the main location for institutions like the World Bank and the IMF. Analysis of the inequalities in pre-tax wages and salaries in these large metropolitan areas gives us a granular and rich view of the inequalities in the overall American economy while pointing towards a close relationship between the local labour market conditions and the specificities of skill demand in these diversified metropolitan cities (Moretti, 2012; Moretti, 2013). This paper finds significant and wide-ranging inequalities between these largest metro areas. Together, these metropolitan areas employed over 39.2 million individuals between the ages of 18 and 65. That is well over a quarter of the American labour force. One should note that the inequalities are not decomposed by occupation or skill level. San Francisco-Oakland-Hayward is found to have both the highest average pre-tax wages and salaries and the widest inequality as measured by the Gini coefficient. The smallest Gini coefficient is observed in Washington-Arlington-Alexandria. Measured in terms of the Gini coefficient, inequalities in pre-tax wages and salaries are nearly 15% higher in San Francisco-Oakland-Hayward as compared to Washington-Arlington-Alexandria. The average pre-tax wages and salaries are found to be about 83% higher in San Francisco-Oakland-Hayward as compared to Miami-Fort Lauderdale-West Palm Beach, which has the lowest average among the ten metros studied.

Literature Review
In a perfect world without discrimination, wage inequality can be reasonably expected if workers differ in their core abilities. If more productive workers are compensated at a higher rate than low productivity workers, wage inequality will exist in the economy, and its magnitude will depend on the distribution of skill among the labour force. If the cost of finding high productivity workers is too high (high search costs) or the competition for high productivity labourers is too steep, employers will have to try to maximize their profits by hiring a mix of high and low productivity workers. Thus, wage inequalities are perfectly plausible even without discrimination, animus, or bias (Levine, 1991; Ma, 1991; Bisin and Gottardi, 2006; Guerrieri, Shimer et al., 2010). Wage inequalities can get considerably exacerbated in the presence of systemic bias, with or without animus. In the presence of animus towards a particular section of the workers (as in racial or gender discrimination), the affected members may earn less compared to others. Without animus, if information acquisition is costly because of a lower representation from certain communities, statistical discrimination can worsen inequalities (Phelps, 1972; Becker, 2010).
The precise source of inequalities is often hard to pin down. Skill dispersion and technological advancement are probably the most benign ways in which inequalities can arise, even in a perfect information framework. The returns to various kinds of skills can also change as technology evolves over time (Freeman and Katz, 1994; Heckman, 1998; Delgado and Stefancic, 2017). This paper does not delve into the question of inequality driven by racial or gender discrimination. Nor does it concern itself with the granularities of skill distribution in labour markets. Instead, this paper looks at the overall inequalities in pre-tax wages and salaries in various metropolitan cities in the USA.

Methodology and Research Methods
Data from the American Community Survey (ACS) for the year 2019 are used in this paper. Harmonized data are extracted from IPUMS USA. The ACS is the main source of national demographic data between decennial census years. It is conducted every year by the US Census Bureau. Data from the ACS are widely used for understanding migration, income, wages, demographic variables, and educational attainment of the American population (Ruggles et al., 2021). The ten largest metropolitan areas are chosen for this study. These metropolitan areas represent the place of work for the sample members and not necessarily the place of primary residence of the sampled individuals. This choice is made to ensure that the wage conventions of the local labour markets apply to the individuals. Many individuals live and work in the same metropolitan area while others do not. Individuals who work in one of the metropolitan areas below but do not live in it are identified by their work-related metro area. The metropolitan areas included in the study are: Atlanta-Sandy Springs-Roswell, Chicago-Naperville-Elgin, Dallas-Fort Worth-Arlington, Houston-The Woodlands-Sugar Land, Los Angeles-Long Beach-Anaheim, Miami-Fort Lauderdale-West Palm Beach, New York-Newark-Jersey City, Philadelphia-Camden-Wilmington, San Francisco-Oakland-Hayward, and Washington-Arlington-Alexandria. Some of the metropolitan areas are entirely contained in a single state, while others spread over multiple neighboring states. For example, the Atlanta-Sandy Springs-Roswell metropolitan area is wholly contained in Georgia, while the Philadelphia-Camden-Wilmington metropolitan area is spread over three states: Pennsylvania, New Jersey, and Delaware. The New York-Newark-Jersey City metropolitan area is also distributed over parts of three states: New York, New Jersey, and Pennsylvania. However, San Francisco-Oakland-Hayward is entirely contained in the state of California. The state(s) in which the metropolitan areas are located are also provided alongside the metro names. The sample for this study is restricted to all employed individuals between the ages of 18 and 65. Total pre-tax wage and salary income received as an employee over the preceding 12 months is included in the study.
These incomes are measured in 2019 US Dollars. Such compensation includes wages, salaries, commissions, cash bonuses, tips, and other money income received while working for an employer. Payments-in-kind or reimbursements for business expenses are not included in the pre-tax wage and salary income calculations. Observations with missing or non-positive pre-tax wages and salaries are excluded from the analysis. Observations with a missing metropolitan identifier are also excluded from the study.
To ensure that extreme values (both low and high) do not bias the analysis, the pre-tax wage and salary incomes are Winsorized to remove the lowest 2.5% and highest 2.5% of the figures, leaving the middle 95% of the pre-tax wage and salary income earners in the sample. Such truncations are performed on a metro-by-metro level to allow for individual metro-level specificities. This adjustment is especially important given that pre-tax wage and salary incomes often vary greatly from one metropolitan area to another. To ensure population-level representation, the samples are weighted by the provided sample weights. The pre-tax wage and salary incomes are used to calculate the mean, median, 25th percentile, 75th percentile, and Gini coefficient for each of the individual metropolitan areas. Standard errors of the means are also calculated and provided for easy comparison between the metros. The Gini coefficient, which is based on the Lorenz curve, may be mathematically defined as half of the relative mean absolute difference. The mean absolute difference may in turn be defined as the average absolute difference of all pairs of pre-tax wage and salary incomes of the population for each of the metros. The relative mean absolute difference may be defined as the mean absolute difference divided by the average of the pre-tax wage and salary incomes. This adjustment is done to normalize for scale and make the inter-metro comparison of Gini coefficients possible (Yitzhaki, 1979; Lambert and Aronson, 1993; Milanovic, 1997).
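The trimming and weighting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual code: the function names (`weighted_quantile`, `trim_middle`, `summarize`) and the toy data are hypothetical, and the quantile rule (smallest value whose cumulative weight share reaches the target) is one of several reasonable conventions. The sketch follows the removal of observations that the text describes; a strict Winsorization would instead clamp extreme values to the cutoffs rather than drop them.

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 <= q <= 1)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum_share = np.cumsum(weights) / weights.sum()
    idx = np.searchsorted(cum_share, q, side="left")
    return values[min(idx, len(values) - 1)]

def trim_middle(values, weights, lower=0.025, upper=0.975):
    """Drop observations outside the weighted [lower, upper] quantile band."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    lo = weighted_quantile(values, weights, lower)
    hi = weighted_quantile(values, weights, upper)
    keep = (values >= lo) & (values <= hi)
    return values[keep], weights[keep]

def summarize(values, weights):
    """Weighted mean, quartiles, and median for one metro."""
    return {
        "mean": float(np.average(values, weights=weights)),
        "p25": float(weighted_quantile(values, weights, 0.25)),
        "median": float(weighted_quantile(values, weights, 0.50)),
        "p75": float(weighted_quantile(values, weights, 0.75)),
    }

# Hypothetical toy data for a single metro: incomes 1..100, equal weights.
incomes = np.arange(1, 101)
weights = np.ones(100)
kept, kept_w = trim_middle(incomes, weights)
stats = summarize(kept, kept_w)
```

In a metro-by-metro analysis, `trim_middle` and `summarize` would be applied separately to each metro's observations, so that the cutoffs reflect that metro's own distribution.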
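When $x_i$ is the pre-tax wage and salary income of person $i$ in a population of $n$ persons, the Gini coefficient may be formally defined for a discrete distribution as

$$G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j|}{2 n^2 \bar{x}},$$

where $\bar{x}$ is the average pre-tax wage and salary income. For a continuous distribution with probability density function $p(x)$, the Gini coefficient may be formally defined as

$$G = \frac{1}{2\mu}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x)\,p(y)\,|x - y|\,dx\,dy,$$

where $\mu = \int_{-\infty}^{\infty} x\,p(x)\,dx$ is the mean of the distribution.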
Furthermore, the Gini coefficient (G) satisfies 0 ≤ G ≤ 1. Note that G=0 represents perfect equality and G=1 represents extreme inequality characterized by a situation where the entire pre-tax wage and salary income belongs to a single individual. A higher value of the Gini coefficient points to more pervasive economic inequality.
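The verbal definition of the Gini coefficient as half of the relative mean absolute difference can be turned into a short computation. The sketch below is illustrative only: the function name `gini` and the optional `weights` argument are assumptions introduced here, and the pairwise approach uses memory proportional to the square of the sample size, so it suits small examples rather than full survey samples.

```python
import numpy as np

def gini(values, weights=None):
    """Gini coefficient as half of the relative mean absolute difference.

    Each pair (i, j) is weighted by w_i * w_j; with equal weights this is
    the usual discrete formula over all ordered pairs.
    """
    x = np.asarray(values, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    mean = np.average(x, weights=w)
    abs_diff = np.abs(x[:, None] - x[None, :])   # |x_i - x_j| for all pairs
    pair_w = w[:, None] * w[None, :]             # weight of each pair
    mean_abs_diff = (pair_w * abs_diff).sum() / w.sum() ** 2
    return mean_abs_diff / (2.0 * mean)

equal = gini([5, 5, 5])              # everyone earns the same
one_earner = gini([0, 0, 0, 10])     # one person holds all income
```

With everyone earning the same amount the result is 0. When one person out of n holds all income, this formula gives (n - 1)/n, which is 0.75 for n = 4 and approaches 1 as the population grows.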

Results
Weighted sample (population) size, mean pre-tax wages and salaries along with the standard error of the same, and the median are presented in Table 1. By weighting the sample by the provided sample weights, quite precise estimates of the averages are obtained: the standard errors of the sample means are small for all the metro areas. All of them are below $100. That is well under 0.5% of the mean for all the metropolitan areas studied in the paper. It may also be noted that the median of pre-tax wages and salaries is smaller than the mean pre-tax wages and salaries for all metropolitan cities. The medians also vary quite widely across the metros. For example, the median pre-tax wages and salaries for Miami-Fort Lauderdale-West Palm Beach is only about 58% of that in San Francisco-Oakland-Hayward or Washington-Arlington-Alexandria. In other words, the median pre-tax wages and salaries of San Francisco-Oakland-Hayward or Washington-Arlington-Alexandria is over 71% larger than that in Miami-Fort Lauderdale-West Palm Beach. It may be noted that the median pre-tax wages and salaries of New York-Newark-Jersey City and Los Angeles-Long Beach-Anaheim are about 17% and 33% lower, respectively, compared to San Francisco-Oakland-Hayward or Washington-Arlington-Alexandria.
One may also note that the distribution of pre-tax wages and salaries is found to be positively skewed for each of the metros considered in the paper. This can be seen simply by noting that the mean of pre-tax wages and salaries is higher than the median for all the metros. This is expected, since all of these metros are major centers of economic activity and are quite likely to have the upper end of the distribution receiving significantly more pre-tax wages and salaries than the lower end of the distribution. One may also note that the median is not affected by movement in the upper end of the distribution. Also, having the same median does not necessarily imply that the units have the same mean. For example, San Francisco-Oakland-Hayward, CA and Washington-Arlington-Alexandria, DC-VA share the same median pre-tax wages and salaries, although the mean pre-tax wages and salaries are nearly 19% higher in San Francisco-Oakland-Hayward compared to Washington-Arlington-Alexandria. Distributional measures of pre-tax wages and salaries are presented in Table 2. The 25th percentile, 75th percentile, and Gini coefficient are presented. One may note that the inter-quartile range is given by the difference between the 75th percentile and the 25th percentile. The widest inter-quartile range is observed in the metropolitan area of San Francisco-Oakland-Hayward, CA. Inequality, as measured by the Gini coefficient, is also highest in that metro. The lowest inequality is observed in the metro area of Washington-Arlington-Alexandria, DC-VA. Inequality measured in terms of the Gini coefficient is nearly 15% higher in San Francisco-Oakland-Hayward than in Washington-Arlington-Alexandria. Following the discussion above, we see that the spread of pre-tax wages and salaries plays an essential role in determining inequality, even when the median remains the same across units.
For example, San Francisco-Oakland-Hayward, CA and Washington-Arlington-Alexandria, DC-VA share the same median pre-tax wages and salaries. However, they do not have the same inequality levels. They are also alike at the lower end of the distribution, as the 25th percentile of pre-tax wages and salaries is $30,000 for both metros. The difference in inequality between them is due not so much to the lower end of the distribution as to the upper end. The 75th percentile of pre-tax wages and salaries is 10% higher in San Francisco-Oakland-Hayward compared to Washington-Arlington-Alexandria. As a result, although the lower end of the distribution remains comparable across these two metros, inequality, as measured by the Gini coefficient, is nearly 15% higher in San Francisco-Oakland-Hayward compared to Washington-Arlington-Alexandria.
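The point that two distributions can share a median and a 25th percentile yet differ in inequality because of the upper tail can be illustrated with a small synthetic example. The figures below are invented for illustration (in thousands of dollars) and are not the survey values for any metro; the names `metro_a`, `metro_b`, and `gini` are hypothetical.

```python
import numpy as np

def gini(values):
    """Unweighted Gini: half of the relative mean absolute difference."""
    x = np.asarray(values, dtype=float)
    mean_abs_diff = np.abs(x[:, None] - x[None, :]).mean()
    return mean_abs_diff / (2.0 * x.mean())

# Synthetic incomes: identical lower halves, heavier upper tail in metro_b.
metro_a = np.array([20, 30, 45, 60, 90, 120])
metro_b = np.array([20, 30, 45, 60, 100, 200])

for name, incomes in (("A", metro_a), ("B", metro_b)):
    print(
        name,
        "p25:", np.percentile(incomes, 25),
        "median:", np.median(incomes),
        "p75:", np.percentile(incomes, 75),
        "gini:", round(gini(incomes), 3),
    )
```

Both synthetic metros have the same 25th percentile and median, but the one with the heavier upper tail has the higher 75th percentile and the higher Gini coefficient.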
Interesting comparisons can also be made between the Chicago-Naperville-Elgin and Philadelphia-Camden-Wilmington metros. They have a similar 25th percentile and an identical 75th percentile of pre-tax wages and salaries. They also have similar mean and median pre-tax wages and salaries. However, as measured by the Gini coefficient, inequality is nearly 2% higher in Chicago-Naperville-Elgin compared to the Philadelphia-Camden-Wilmington metro. One can also note that the inter-quartile range is larger in Atlanta-Sandy Springs-Roswell compared to Miami-Fort Lauderdale-West Palm Beach.
Furthermore, both the mean and median pre-tax wages and salaries are about 20% higher in Atlanta-Sandy Springs-Roswell than in Miami-Fort Lauderdale-West Palm Beach. Despite that, these two metropolitan areas share the same inequality as measured by the Gini coefficient. The comparison between these two metropolitan areas shows that a larger inter-quartile range does not necessarily translate into a higher inequality as measured by the Gini coefficient. It also shows that a poorer area does not necessarily have lower inequality. Although Miami-Fort Lauderdale-West Palm Beach is poorer than Atlanta-Sandy Springs-Roswell by numerous measures like the 25th percentile, 75th percentile, mean, and median, the two metros still share the same inequality as measured by the Gini coefficient.
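One way to read the Atlanta and Miami comparison is through the scale invariance of the Gini coefficient. As a sketch, assume hypothetically that every income $y_i$ in one metro is a constant multiple $c > 0$ of a corresponding income $x_i$ in the other; the actual metros are not exact rescalings of each other, and the symbols $x_i$, $y_i$, $c$, and $n$ are introduced here only for illustration.

```latex
% Assumption: y_i = c x_i for all i = 1, ..., n, with c > 0.
\[
G(y)
  = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} |c\,x_i - c\,x_j|}{2 n^2\, c\,\bar{x}}
  = \frac{c \sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j|}{c \cdot 2 n^2\, \bar{x}}
  = G(x),
\qquad
\mathrm{IQR}(y) = c\,\mathrm{IQR}(x).
\]
```

Under this assumption with $c = 1.2$, the mean, median, both quartiles, and the inter-quartile range are all 20% higher, while the Gini coefficient is unchanged.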

Conclusions
Inequalities in pre-tax wages and salaries in the ten largest metropolitan areas of the USA are studied in the paper. The cities are large and complex, and they are distributed all over the country. These cities represent the key strengths of the American economy, from the financial sector to technology, tourism, media and entertainment, and commodity and futures trading. Together, these represent the very backbone of the American economy and labour market. The smallest Gini coefficient is observed in the Washington-Arlington-Alexandria metropolitan area. Inequality measured in terms of the Gini coefficient is nearly 15% higher in San Francisco-Oakland-Hayward than in Washington-Arlington-Alexandria. The average pre-tax wages and salaries are about 83% higher in San Francisco-Oakland-Hayward than in Miami-Fort Lauderdale-West Palm Beach, which has the lowest average among the ten metros studied. Cities are often found to be remarkably similar in the middle and lower end of the pre-tax income distribution. However, significant disparities remain at the upper end of the distribution. That leads to cities like Philadelphia and Chicago having different levels of inequality (as measured through the Gini coefficient) even though they share similar middle and lower-end distributions. The San Francisco and Washington, DC areas have the widest inter-quartile ranges for the distribution of pre-tax wages and salaries. The inter-quartile range is the narrowest in Miami-Fort Lauderdale-West Palm Beach, FL. While aggregate inequalities attract intense attention, these results point to significant and wide-ranging variations between different regions in the nation. By focusing on pre-tax wages and salaries, this study captures the inequalities that are most closely related to labour market conditions, setting aside other sources of income like capital gains, inheritance, and government transfers.
Furthermore, by pulling together multiple indicators of the distribution of pre-tax wages and salaries (like mean, median, 25th and 75th percentiles, inter-quartile range, and Gini coefficient), this paper provides a more granular view of labour income in major American cities, enabling us to compare them on several income distribution-related parameters.
Funding. There is no funding for this research.