The US Census Bureau recently released the Selected Population Tables, based on the American Community Survey (ACS) 2010 5-year estimates. These tables present data on income, migration, family structure, and many other subjects, broken down by 392 racial, ethnic, and ancestry groups. The files are difficult to work with because each combination of state, sequence (a collection of related subject tables), and group (race, ethnicity, or ancestry) is released as a separate file, for a total of ~1.5 million files. To make these data easier to work with, Center for Urban Research associate Lee Hachadoorian has developed SQL scripts that import the data into a PostgreSQL database. The scripts have been published to a git repository at https://github.com/leehach.
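To illustrate why one file per state, sequence, and group adds up so quickly, here is a minimal Python sketch that emits one PostgreSQL COPY statement per combination. It is not taken from the repository: the function name, the directory, the file naming pattern, and the staging table names are all hypothetical, and the real files follow the Census Bureau's own naming scheme.

```python
from itertools import product


def copy_statements(states, sequences, groups, data_dir="/data/acs_spt"):
    """Yield one COPY statement per (state, sequence, group) file."""
    for state, seq, group in product(states, sequences, groups):
        # Hypothetical file and table names, used only for illustration.
        path = f"{data_dir}/{state}_{seq:04d}_{group}.csv"
        table = f"tmp_seq{seq:04d}"
        yield f"COPY {table} FROM '{path}' WITH (FORMAT csv);"


# Two states, three sequences, and two groups already mean 2 * 3 * 2 files.
statements = list(copy_statements(["ny", "nj"], [1, 2, 3], ["001", "002"]))
for statement in statements[:3]:
    print(statement)
```

Even this toy input yields a dozen statements, which is why generating the import commands programmatically is more practical than writing them by hand.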
The repository contains two kinds of scripts: the data import scripts and the meta-scripts (or script-creation scripts). The data import scripts can be used directly and are probably the easiest to understand and use. They are generated programmatically from the ACS 2010 Selected Population Tables data dictionary by script-creation functions, which are themselves created by the meta-scripts. The functions give the data manager more flexibility and control over the import process, and can be used or adapted for other Census data products. For example, the repository currently also includes import scripts for the main ACS 2010 5-year summary files.
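The general idea of generating import scripts from a data dictionary can be sketched as follows. This is an illustration in Python, not the repository's code: the dictionary layout, the column names and types, the table naming, and the function name create_table_sql are all assumptions made for the example.

```python
def create_table_sql(table_name, columns):
    """Build a CREATE TABLE statement from (column_name, sql_type) pairs."""
    column_defs = ",\n".join(
        f"    {name} {sql_type}" for name, sql_type in columns
    )
    return f"CREATE TABLE {table_name} (\n{column_defs}\n);"


# Hypothetical data dictionary: sequence number -> list of (column, type).
data_dictionary = {
    1: [("geoid", "varchar(40)"), ("b01001_001", "numeric")],
    2: [("geoid", "varchar(40)"), ("b19013_001", "numeric")],
}

# One generated script per sequence, keyed by sequence number.
scripts = {
    seq: create_table_sql(f"seq{seq:04d}", columns)
    for seq, columns in data_dictionary.items()
}

print(scripts[1])
```

Because the table definitions are derived from the dictionary rather than typed by hand, the same approach can be pointed at a different data dictionary to produce scripts for another Census product.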
These scripts are released under the GNU General Public License. Please feel free to use or modify them, and please share them with other Census data users. For questions or comments, please use GitHub or email Lee.Hachadoorian@gmail.com.