4: Consolidating Datasets (Challenge: Data Munging Using The Command Line) - 白红宇

Published: 2019-06-26

Looks good! Now finish the job by adding in the data from the other datasets.

Instructions

Append the remaining datasets in the order of the years they describe.
- Select all non-header rows from Hud_2007.csv and append to combined_hud.csv.
- Select all non-header rows from Hud_2013.csv and append to combined_hud.csv.

Display the last 10 rows of combined_hud.csv and verify that they match the last 10 rows of Hud_2013.csv.

/home/dq$ wc -l Hud_2007.csv

42730 Hud_2007.csv

/home/dq$ tail -42729 Hud_2007.csv >> combined_hud.csv

/home/dq$ wc -l Hud_2013.csv

64536 Hud_2013.csv

/home/dq$ tail -64535 Hud_2013.csv >> combined_hud.csv
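
The row counts passed to tail above are the wc -l totals minus one for the header line. As an alternative sketch, `tail -n +2` starts printing at line 2, so the header is skipped without computing the count first. The helper names `append_rows` and `same_tail` below are hypothetical, introduced only for illustration; the second one covers the verification step from the instructions.

```shell
# Hypothetical helper: append every non-header row of a CSV ($1) to a target file ($2).
# `tail -n +2` begins output at line 2, so the header line is dropped.
append_rows() {
    tail -n +2 "$1" >> "$2"
}

# Hypothetical helper: succeed only if the last 10 lines of both files are identical.
same_tail() {
    [ "$(tail -n 10 "$1")" = "$(tail -n 10 "$2")" ]
}
```

With these defined, `append_rows Hud_2007.csv combined_hud.csv` followed by `append_rows Hud_2013.csv combined_hud.csv` would do the appends, and `same_tail combined_hud.csv Hud_2013.csv && echo match` would check the final rows. Running `tail -n 10 combined_hud.csv` displays them directly.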

######################################################

5: Counting

Now that you have a consolidated dataset, you can start to answer basic questions on the entire dataset.

Instructions

Count and display the number of lines in combined_hud.csv containing 1980-1989.

/home/dq$ grep '1980-1989' combined_hud.csv |wc -l

13672
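
The same count can be obtained without the pipe, since grep's `-c` option prints the number of matching lines instead of the lines themselves. A minimal sketch, using a hypothetical helper name `count_matching_lines` and adding `-F` so the pattern is treated as a literal string:

```shell
# Hypothetical helper: print how many lines of file $2 contain the literal string $1.
# -c counts matching lines; -F disables regex interpretation of the pattern.
count_matching_lines() {
    grep -cF -- "$1" "$2"
}
```

For the challenge above this would be called as `count_matching_lines '1980-1989' combined_hud.csv`. Like `grep ... | wc -l`, it counts lines rather than occurrences, so a line containing the string twice is counted once.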

###########################################################

6: Next Steps

In this challenge, you learned about a few useful commands for exploring files and practiced data munging from the command line. Next in this course is a guided project where you'll explore how to create Python scripts from the command line for more robust and reusable logic.

Reposted from: https://my.oschina.net/Bettyty/blog/747173