There are many scenarios where you need to quickly analyze, modify and process large files, both in number and size. These files are often text-based Comma Separated Values (CSV) files. Opening them with standard spreadsheet applications, such as Excel, LibreOffice or OpenOffice, overloads a machine's memory. Also, processing a large number of files in one batch often fails after a few hours because of some unexpected file anomaly. You inevitably spend a lot of time on frozen screens, restarts and long waits. Yet most of these tasks could be carried out with a few lines of code. What's more, familiarity with a few simple shell commands can go a long way in saving time and reducing frustration.

This post gives an overview of some shell commands that I use nearly every day. Note that there are different types of shell (bash, zsh, ...), with bash being the most common, as it has long been the default shell on OS X and on major Linux distributions. Zsh (and Oh My Zsh) is a popular and powerful alternative. The shell commands included in this blog post have been tested on bash on OS X (macOS) and should work with other shells and environments.

All the following examples are based on the adult dataset from the UCI Machine Learning Repository, also known as the "Census Income" dataset. This dataset is commonly used to predict whether income exceeds $50K/yr based on census data. With 48842 rows and 14 attributes, it is not a large dataset by far, but it is sufficient to illustrate the examples. Despite its .data extension, it is a well-formatted CSV file. Feel free to download the dataset to follow along!

Count with wc

Given a new text-based file, you want to know how many lines it contains. This can be done with the word count command and its -l flag:

$ wc -l adult.data

which tells you that the adult.data file contains 32562 lines. You can also use wc to count words instead, using the -w flag. The adult.data file contains nearly 500k words.

The wc command can also count the number of files in a directory by using the output of a simple ls -l command as input to wc. Using the output of one command as the input to another with the pipe symbol | is a useful shell pattern called pipelining. You'll see it in several examples throughout this post.

Assume now you have a folder with many files. The following command will tell you how many files it contains:

$ ls -l | wc -l

Keep in mind that ls -l starts its output with a "total" line, so the result is one more than the number of entries. Plain ls | wc -l gives the exact count. There are many other applications for wc if you start adding wildcards and subfolders.

Let's go back to your adult.data file and look at the first lines with the head command.
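Before moving on, here is a minimal sketch of the counting commands described above that you can run without downloading anything. It assumes bash and a throwaway directory created with mktemp. The file names (sample.data, a.csv, b.csv) and the three rows are made up for illustration and are not taken from the real adult.data file.

```shell
#!/usr/bin/env bash
# Scratch directory so the sketch never touches real data (hypothetical file names).
workdir=$(mktemp -d)

# Three comma-separated rows written in the style of adult.data (made-up values).
printf '%s\n' \
  '39, State-gov, Bachelors, <=50K' \
  '50, Self-emp-not-inc, Bachelors, <=50K' \
  '38, Private, HS-grad, <=50K' > "$workdir/sample.data"

# Reading from stdin (<) makes wc print only the number, without the file name.
# tr strips the padding spaces that some wc implementations add.
line_count=$(wc -l < "$workdir/sample.data" | tr -d ' ')
word_count=$(wc -w < "$workdir/sample.data" | tr -d ' ')

# Add two empty files, then count the directory entries through a pipe.
touch "$workdir/a.csv" "$workdir/b.csv"
file_count=$(ls "$workdir" | wc -l | tr -d ' ')

# ls -l adds a leading "total" line, so this count is one higher.
long_count=$(ls -l "$workdir" | wc -l | tr -d ' ')

echo "lines=$line_count words=$word_count files=$file_count ls-l=$long_count"

# Clean up the scratch directory.
rm -r "$workdir"
```

On this sample the script should print lines=3 words=12 files=3 ls-l=4, which shows the extra "total" line that ls -l contributes. Redirecting the file into wc with < is only a convenience here: it keeps the file name out of the output so the number can be stored in a variable.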