Jeroen Janssens - Data Science at the Command Line

Year: 2014. Publisher: O’Reilly Media, Inc. Genre: Computer.

This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.

Appendix A. List of Command-Line Tools

This is an overview of all the command-line tools discussed in this book. This includes binary executables, interpreted scripts, and Bash builtins and keywords. For each command-line tool, the following information, when available and appropriate, is provided:

  • The actual command to type at the command line

  • A description

  • The name of the package it belongs to

  • The version used in the book

  • The year that version was released

  • The primary author(s)

  • A website to find more information

  • How to install it

  • How to obtain help

  • An example usage

All command-line tools listed here are included in the Data Science Toolbox for Data Science at the Command Line. Instructions on how to set it up are given elsewhere in the book. The install commands assume that you're running Ubuntu 14.04. Please note that citing open source software is not trivial, and that some information may be missing or incorrect.

alias

Define or display aliases. Alias is a Bash builtin.

$ help alias
$ alias ll='ls -alF'

awk

Pattern scanning and text processing language. Mawk (version 1.3.3) by Mike Brennan (1994). http://invisible-island.net/mawk.

$ sudo apt-get install mawk
$ man awk
$ seq 5 | awk '{sum+=$1} END {print sum}'
15
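The same accumulate-then-print pattern extends to other summaries. The following is a minimal sketch, not taken from the book, that computes both the sum and the mean of the numbers 1 through 5; the shell variable names `sum` and `mean` are chosen here for illustration only.

```shell
# Sum: add the first field of every line, print the total at the end.
sum=$(seq 5 | awk '{sum+=$1} END {print sum}')

# Mean: same accumulation, divided by NR (the number of records read).
mean=$(seq 5 | awk '{sum+=$1} END {print sum/NR}')

echo "sum=$sum mean=$mean"
```

NR is a built-in awk variable, so no separate counter is needed.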
aws

Manage AWS Services such as EC2 and S3 from the command line. AWS Command Line Interface (version 1.3.24) by Amazon Web Services (2014). http://aws.amazon.com/cli.

$ sudo pip install awscli
$ aws help
$ aws ec2 describe-regions | head -n 5
{
    "Regions": [
        {
            "Endpoint": "ec2.eu-west-1.amazonaws.com",
            "RegionName": "eu-west-1"

bash

GNU Bourne-Again SHell. Bash (version 4.3) by Brian Fox and Chet Ramey (2010). http://www.gnu.org/software/bash.

$ sudo apt-get install bash
$ man bash

bc

Evaluate equation from standard input. Bc (version 1.06.95) by Philip A. Nelson (2006). http://www.gnu.org/software/bc.

$ sudo apt-get install bc
$ man bc
$ echo 'e(1)' | bc -l
2.71828182845904523536

bigmler

Access BigML's prediction API. BigMLer (version 1.12.2) by BigML (2014). http://bigmler.readthedocs.org.

$ sudo pip install bigmler
$ bigmler --help

body

Apply an expression to all but the first line. Useful if you want to apply classic command-line tools to CSV files with a header. Body by Jeroen H.M. Janssens (2014). https://github.com/jeroenjanssens/data-science-at-the-command-line.

$ git clone https://github.com/jeroenjanssens/data-science-at-the-command-line.git
$ echo -e "value\n7\n2\n5\n3" | body sort -n
value
2
3
5
7
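To make the idea behind body concrete, here is a minimal sketch of how such a tool could work. It is an assumption about the approach, not the actual script from the repository: the hypothetical function `body_sketch` reads the header line, prints it unchanged, and lets the given command process whatever remains on standard input.

```shell
# Hypothetical stand-in for body (name and implementation are illustrative).
body_sketch() {
    IFS= read -r header      # consume only the first line of standard input
    printf '%s\n' "$header"  # print the header unchanged
    "$@"                     # run the given command on the remaining lines
}

# Sort the values numerically while keeping the header on top.
sorted=$(printf 'value\n7\n2\n5\n3\n' | body_sketch sort -n)
echo "$sorted"
```

Because the header never reaches sort, it stays on the first line regardless of how the remaining rows are ordered.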
cat

Concatenate files and standard input, and print on standard output. Cat (version 8.21) by Torbjorn Granlund and Richard M. Stallman (2012). http://www.gnu.org/software/coreutils.

$ sudo apt-get install coreutils
$ man cat
$ cat results-01 results-02 results-03 > results-all
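The concatenation above can be tried without any existing files. This sketch creates three small placeholder files in a temporary directory (their contents are made up for illustration) and combines them in the order given.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
printf 'a\n' > "$tmpdir/results-01"
printf 'b\n' > "$tmpdir/results-02"
printf 'c\n' > "$tmpdir/results-03"

# Concatenate the three files, in argument order, into one file.
cat "$tmpdir/results-01" "$tmpdir/results-02" "$tmpdir/results-03" > "$tmpdir/results-all"

combined=$(cat "$tmpdir/results-all")
echo "$combined"
```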
cd

Change the shell working directory. Cd is a Bash builtin.

$ help cd
$ cd ~; pwd; cd ..; pwd
/home/vagrant
/home

chmod

Change file mode bits. We use it to make our command-line tools executable. Chmod (version 8.21) by David MacKenzie and Jim Meyering (2012). http://www.gnu.org/software/coreutils.

$ sudo apt-get install coreutils
$ man chmod
$ chmod u+x experiment.sh
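As a rough illustration of why the executable bit matters, the sketch below writes a tiny placeholder script (its contents are invented for this example), makes it executable for its owner, and then runs it directly by path.

```shell
# Create a placeholder script in a throwaway directory.
workdir=$(mktemp -d)
printf '#!/bin/sh\necho "experiment ran"\n' > "$workdir/experiment.sh"

# Grant the owning user execute permission.
chmod u+x "$workdir/experiment.sh"

# The script can now be invoked as a command-line tool in its own right.
output=$("$workdir/experiment.sh")
echo "$output"
```

Without the chmod step, invoking the script by path would typically fail with a permission error.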
cols

Apply a command to a subset of the columns and merge the result back with the remaining columns. Cols by Jeroen H.M. Janssens (2014). https://github.com/jeroenjanssens/data-science-at-the-command-line.

$ git clone https://github.com/jeroenjanssens/data-science-at-the-command-line.git
$ < iris.csv cols -C species body tapkee --method pca | header -r x,y,species

cowsay

Generate an ASCII picture of a cow with a message. Useful for when building up a particular pipeline is starting to frustrate you a bit too much. Cowsay (version 3.03+dfsg1) by Tony Monroe (1999).

$ sudo apt-get install cowsay
$ man cowsay
$ echo 'The command line is awesome!' | cowsay
 ______________________________
< The command line is awesome! >
 ------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

cp

Copy files and directories. Cp (version 8.21) by Torbjorn Granlund, David MacKenzie, and Jim Meyering (2012). http://www.gnu.org/software/coreutils.

$ sudo apt-get install coreutils
$ man cp
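No example usage is listed for cp, so the following is a small sketch under assumed file names (`points.csv`, `data-copy`, and so on are hypothetical). It copies a single file and then a whole directory using the recursive option.

```shell
# Set up a throwaway directory with one data file.
src=$(mktemp -d)
mkdir "$src/data"
printf 'x,y\n1,2\n' > "$src/data/points.csv"

# Copy a single file to a new name.
cp "$src/data/points.csv" "$src/points-backup.csv"

# Copy a directory and everything in it.
cp -r "$src/data" "$src/data-copy"

copied=$(cat "$src/data-copy/points.csv")
echo "$copied"
```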
csvcut

Extract columns from CSV data. Like the cut command-line tool, but for tabular data. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvcut --help

csvgrep

Filter tabular data to only those rows where certain columns contain a given value or match a regular expression. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvgrep --help

csvjoin

Merge two or more CSV tables together using a method analogous to a SQL JOIN operation. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvjoin --help

csvlook

Render a CSV file to the command line in a readable, fixed-width format. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvlook --help
$ echo -e "a,b\n1,2\n3,4" | csvlook
|----+----|
|  a | b  |
|----+----|
|  1 | 2  |
|  3 | 4  |
|----+----|

csvsort

Sort CSV files. Like the sort command-line tool, but for tabular data. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvsort --help

csvsql

Execute SQL queries directly on CSV data or insert CSV into a database. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvsql --help

csvstack

Stack up the rows from multiple CSV files, optionally adding a grouping value to each row. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvstack --help

csvstat

Print descriptive statistics for all columns in a CSV file. Csvkit (version 0.8.0) by Christopher Groskopf (2014). http://csvkit.readthedocs.org.

$ sudo pip install csvkit
$ csvstat --help

curl

Download data from a URL. cURL (version 7.35.0) by Daniel Stenberg (2012). http://curl.haxx.se.

$ sudo apt-get install curl
$ man curl

curlicue

Perform OAuth dance for curl. Curlicue by Decklin Foster (2014). https://github.com/decklin/curlicue.

$ git clone https://github.com/decklin/curlicue.git

cut

Remove sections from each line of files. Cut (version 8.21) by David M. Ihnat, David MacKenzie, and Jim Meyering (2012). http://www.gnu.org/software/coreutils.

$ sudo apt-get install coreutils
$ man cut
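No example usage is listed for cut, so here is a brief sketch with made-up comma-separated input. It keeps the first and third fields of each line, using `-d` to set the delimiter and `-f` to select fields.

```shell
# Keep fields 1 and 3 of each comma-delimited line (sample data is illustrative).
selected=$(printf 'id,name,score\n1,ann,9\n2,bob,7\n' | cut -d, -f1,3)
echo "$selected"
```

Note that cut splits on every delimiter character and does not understand CSV quoting, which is one reason the appendix also lists csvcut for tabular data.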
display

Display an image or image sequence on any X server. Can read image data from standard input. Display (version 8:6.7.7.10) by ImageMagick Studio LLC (2009). http://www.imagemagick.org.

$ sudo apt-get install imagemagick
$ man display