4: Consolidating Datasets (Challenge: Data Munging Using the Command Line)
Published: 2019-06-26


Looks good! Now finish the job by adding in the data from the other datasets.

Instructions

  • Append the remaining datasets in the order of the years they describe.
    • Select all non-header rows from Hud_2007.csv and append to combined_hud.csv.
    • Select all non-header rows from Hud_2013.csv and append to combined_hud.csv.
  • Display the last 10 rows of combined_hud.csv and verify that they match the last 10 rows of Hud_2013.csv.

 

/home/dq$ wc -l Hud_2007.csv
42730 Hud_2007.csv
/home/dq$ tail -42729 Hud_2007.csv >> combined_hud.csv
/home/dq$ wc -l Hud_2013.csv
64536 Hud_2013.csv
/home/dq$ tail -64535 Hud_2013.csv >> combined_hud.csv
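
The session above counts lines with wc -l and then passes that count minus one to tail. As a minimal sketch of an alternative, assuming a tail that accepts the POSIX `-n +2` form, the header can be skipped without counting anything, and the last instruction (checking that the final 10 rows match) can be done with cmp. The rows and column names below are made-up stand-ins in a scratch directory, not the real HUD data.

```shell
# Work in a scratch directory with tiny stand-in files
# (hypothetical rows and column names, not the real HUD datasets).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'year,age\n2005,41\n' > combined_hud.csv
printf 'year,age\n2007,35\n2007,52\n' > Hud_2007.csv
printf 'year,age\n' > Hud_2013.csv
awk 'BEGIN { for (i = 1; i <= 12; i++) print "2013," i }' >> Hud_2013.csv

# tail -n +2 prints from line 2 onward, i.e. every row except the header,
# so there is no need to run wc -l and subtract one.
tail -n +2 Hud_2007.csv >> combined_hud.csv
tail -n +2 Hud_2013.csv >> combined_hud.csv

# Save the last 10 rows of each file and compare them.
# cmp -s prints nothing and exits 0 only when the two files are identical.
tail -n 10 combined_hud.csv > combined_tail.txt
tail -n 10 Hud_2013.csv > hud_2013_tail.txt
if cmp -s combined_tail.txt hud_2013_tail.txt; then
    tails_match=yes
else
    tails_match=no
fi
echo "last 10 rows match: $tails_match"
```

Appending in year order matters here: the check only passes if Hud_2013.csv is the last file appended.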

 

######################################################

 

5: Counting

Now that you have a consolidated dataset, you can start to answer basic questions on the entire dataset.

Instructions

  • Count and display the number of lines in combined_hud.csv containing 1980-1989.

/home/dq$ grep '1980-1989' combined_hud.csv | wc -l
13672
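
Piping grep into wc -l works, but grep can also count matching lines itself with its standard `-c` option. The sketch below shows that both forms give the same number on a small stand-in file; the rows and column names are hypothetical, not taken from the real dataset.

```shell
# Scratch directory with a tiny stand-in file (hypothetical rows and columns).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'year,built\n2005,1980-1989\n2007,1990-1999\n2013,1980-1989\n' > combined_hud.csv

# Form used in the session above: print matching lines, then count them.
piped_count=$(grep '1980-1989' combined_hud.csv | wc -l)

# Equivalent form: -c makes grep print the number of matching lines.
direct_count=$(grep -c '1980-1989' combined_hud.csv)

echo "grep | wc -l: $piped_count"
echo "grep -c:      $direct_count"
```

Both forms count lines rather than occurrences, so a line containing the pattern twice is still counted once.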

 

###########################################################

6: Next Steps

In this challenge, you learned a few useful commands for exploring files and practiced data munging from the command line. Next in this course is a guided project where you'll explore how to create Python scripts from the command line for more robust and reusable logic.

Reposted from: https://my.oschina.net/Bettyty/blog/747173
