Introduction
Why are immigrants successful?
The U.S. is often described as a melting pot that attracts a diverse population from all around the world. With globalization, immigration has become an important topic in both the social and political domains. Many people come to the United States for reasons such as employment, family, and asylum, seeking better living conditions. As the number of immigrants has increased, they have become a large part of society, with a substantial impact on almost every sector. It is therefore important to look at how well they are living in their new environment. This project examines immigrants’ success by comparing them with native-born Americans and with their national counterparts.
One thing that is often ignored is the number of slots available to each origin country. Lazear argues that the success of immigrants depends not only on their characteristics but also on the number of people who can migrate to the destination country. Even if a country, such as Japan, has high educational attainment, immigrants from that country will have lower educational attainment or wages on average because there are simply more people migrating, which lowers the average (Lazear 2021). This is why this project takes national averages into account for a rigorous comparison.
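The averaging argument can be illustrated with a small sketch. It assumes, purely for illustration, that migrants are drawn from the top of the origin country’s education distribution, and it uses made-up years of schooling rather than any real data; the function name mean_of_top_k is hypothetical.

```python
from statistics import mean


def mean_of_top_k(values, k):
    """Average of the k highest values (an assumed stand-in for 'k migration slots')."""
    ranked = sorted(values, reverse=True)
    return mean(ranked[:k])


# Made-up years of schooling for one origin country (illustrative only).
education_years = [8, 10, 12, 12, 14, 16, 16, 18, 20]

# With few slots, only the most educated migrate; with many slots,
# the migrant group reaches further down the distribution.
few_slots = mean_of_top_k(education_years, 2)
many_slots = mean_of_top_k(education_years, 6)

print(few_slots, many_slots)
```

Under this assumption the migrant average falls as the number of slots grows, even though the origin country’s own distribution is unchanged.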
So how do we define success? What does success mean for immigrants, and what is a good measure of it? Kyeremeh et al. studied the concept of successful integration among African immigrants in Canada. The authors noted that successful integration into a society can be measured both qualitatively and quantitatively. As part of ‘fitting in’ or ‘becoming a part of the host society’, the paper finds that successful integration is best understood as a process of becoming integrated. This process includes “achieving pre-migration aspirations” or “creating avenues for personal growth and development” (Kyeremeh et al. 2021). In this project, wage and educational attainment are the main measures of immigrants’ success.
Questions
The following core questions will be answered throughout the analysis.
- How can we define success for immigrants? Is it different from success for native-born Americans?
- Does English proficiency affect immigrants’ success?
- Does their demographic information affect their success?
- Does their origin country affect their success?
- Do those indicators change over the years?
- Which indicator is the most influential for their success?
- How do the indicators relate to each other?
- What are the barriers to immigrants’ success?
- Do those indicators differ between the origin country and the U.S.?
- Does the naturalization application form reflect the factors that are highly correlated with success?
Project
Answering the questions above will reveal which factors most influence immigrants’ success in the U.S. Those factors include age, place of birth, English proficiency, marital status, gender, racial category, and employment status. Immigrants in this project are defined as people who were born outside the U.S. and were later naturalized. Success is defined in terms of wage and education: earning more than 100,000 or having attained at least a Bachelor’s degree.
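The success criterion can be written as a simple rule. The sketch below is a minimal illustration, not the project’s actual code: the Person record, the education labels, and their ordering are hypothetical, and the wage threshold is assumed to be an annual amount in U.S. dollars.

```python
from dataclasses import dataclass

# Hypothetical ordered education levels; the project's real coding may differ.
EDUCATION_ORDER = [
    "less_than_high_school",
    "high_school",
    "some_college",
    "bachelors",
    "masters",
    "doctorate",
]

# Threshold from the text; the unit (annual USD) is an assumption.
WAGE_THRESHOLD = 100_000


@dataclass
class Person:
    wage: float
    education: str  # one of EDUCATION_ORDER


def is_successful(person: Person) -> bool:
    """True if the wage exceeds the threshold OR education is at least a Bachelor's."""
    has_degree = (
        EDUCATION_ORDER.index(person.education)
        >= EDUCATION_ORDER.index("bachelors")
    )
    return person.wage > WAGE_THRESHOLD or has_degree


print(is_successful(Person(wage=120_000, education="high_school")))
print(is_successful(Person(wage=40_000, education="bachelors")))
print(is_successful(Person(wage=100_000, education="some_college")))
```

Because the two conditions are joined with "or", a person meeting either one counts as successful, and a wage of exactly 100,000 alone does not qualify.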
The project proceeds as follows: Introduction, Preparation, Analysis, and Conclusion. In the preparation stage, data are gathered and cleaned so that all datasets can be integrated. An exploratory data analysis follows, with plots comparing immigrants and native-born Americans and word clouds of text data from USCIS and MPI. For the analysis, five different data analysis methods are applied; more details can be found on each tab. Finally, the conclusion wraps up the project with both technical and non-technical results, along with limitations and possible future work.
Hypothesis
As the prior research discussed earlier suggests, integration into society appears to be the most influential aspect of being successful in the U.S. That is, English proficiency and employment status would be the most important factors for immigrants. There will also most likely be significant differences in these factors between immigrants based on their place of birth.
- English proficiency will be one of the most critical factors in immigrants’ success.
- Immigrants from developed countries will be relatively less successful than those from developing countries.
- The information required for naturalization will be similar to what immigrants need for their success.
Data
This project analyzes data from several sources and from various perspectives. With US Census data as the base, it focuses on potential indicators of success. World Bank and OECD data provide another dimension by covering conditions outside the U.S. Furthermore, the naturalization application form (N-400) will be analyzed to see whether its questions reflect those indicators well. Reports on immigrants from the Migration Policy Institute will also be included to make connections back to the census data.
Methods
This comprehensive analysis contains both supervised and unsupervised learning.
- Naive Bayes Classification
- Decision Trees
- Clustering
- Dimensionality Reduction
- Association Rule Mining
Clustering and dimensionality reduction will focus on countries, using data aggregated by place of birth. Naive Bayes, Decision Trees, and Association Rule Mining will be used to compare immigrants and native-born Americans with respect to their success.