5.7. Glossary

5.7.1. Definitions

Cell: A rectangular box containing text or code in a notebook.

Code Cell: A cell in JupyterLab that you can write code in. It uses Python 3 as its programming language.

Data Frame: A data frame is a two-dimensional table of labeled rows and columns, often holding a subset of a larger dataset. Data frames are used to carry out specific data operations that may not need the entire dataset. (In pandas it is called a DataFrame.)

Explicit Index: Uses the values (numeric or non-numeric) set as the index. For example, if we set a column or row as the index, then we can use the values in that column or row as indices in different pandas methods.

Implicit Index: Uses the integer position of each entry (0, 1, 2, ...), similar to Python-style list indexing.

Index: An index is a value that identifies a position (address) in a DataFrame or Series.
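The difference between explicit and implicit indexing can be seen in a short sketch. The Series below and its labels are made-up illustration data, not taken from the chapter.

```python
import pandas as pd

# Hypothetical data: a Series whose explicit index is a set of string labels.
prices = pd.Series([1.20, 0.50, 3.00], index=["apple", "banana", "cherry"])

# Explicit index: look the value up by its label.
explicit_value = prices.loc["banana"]

# Implicit index: look the value up by its integer position (counting from 0).
implicit_value = prices.iloc[1]

print(explicit_value, implicit_value)
```

Both lookups reach the same entry here, because "banana" is the label of the item at position 1.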

Markdown: Markdown is a lightweight markup language written in plain text. It is commonly converted to HTML, XHTML, PDF, and other formats for presentation. Refer to the relevant appendix for more about Markdown.

Series: A Series is a one-dimensional array of related data values, each paired with an index label. A single column of a DataFrame is a Series.

Text Cell: A cell in JupyterLab that you can write text in. The text is written in a language called Markdown.

5.7.2. Keywords

import: Import lets programmers use packages, libraries or modules that have already been programmed.

<DataFrame>[<string>]: Returns the Series corresponding to the given column (<string>).

<DataFrame>[<list of strings>]: returns a given set of columns as a DataFrame.

<DataFrame>[<series/list of Boolean>]: Returns the rows of the DataFrame whose corresponding entry in the given Boolean list or Series is True.
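The three bracket forms above can be sketched on a small, made-up DataFrame. The column names and values are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "name": ["Ada", "Ben", "Cy"],
    "age": [36, 29, 41],
    "city": ["Oslo", "Lima", "Pune"],
})

# A single column name returns a Series.
one_column = df["age"]

# A list of column names returns a DataFrame with just those columns.
two_columns = df[["name", "city"]]

# A Boolean Series keeps only the rows where the value is True.
mask = df["age"] > 30
older = df[mask]

print(older)
```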

<DataFrame>.loc[ ]: Uses explicit indexing (index labels) to return the rows with the given labels and the values associated with them.

<DataFrame>.loc[<string1>:<string2>]: Takes a range of explicit indices and returns a DataFrame containing those indices and the values associated with them. Both endpoints of the range are included.

<DataFrame>.loc[<string>]: Uses an explicit index and returns the row(s) for that index value.

<DataFrame>.loc[<list/series of strings>]: Returns a new DataFrame containing the rows whose labels are given in the list of strings.
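As a rough illustration of the .loc forms listed above, the sketch below uses a made-up DataFrame whose index holds string labels.

```python
import pandas as pd

# Hypothetical example data with string labels as the explicit index.
df = pd.DataFrame(
    {"age": [36, 29, 41], "city": ["Oslo", "Lima", "Pune"]},
    index=["ada", "ben", "cy"],
)

# One label: the matching row (a Series when the label is unique).
single = df.loc["ben"]

# A label range: unlike Python slicing, both endpoints are included.
label_range = df.loc["ada":"ben"]

# A list of labels: the rows come back in the order the labels are given.
chosen = df.loc[["cy", "ada"]]

print(chosen)
```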

<DataFrame>.iloc[ ]: Uses implicit indexing (integer positions) to return the rows at the given positions and the values associated with them.

<DataFrame>.iloc[<index, range of indices>]: Takes an implicit index (or a range of implicit indices) and returns the row(s) at those positions. As with Python slicing, the end of a range is excluded.
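A minimal sketch of .iloc, again on made-up data, shows that positions are used even when the DataFrame has string labels.

```python
import pandas as pd

# Hypothetical example data with string labels as the explicit index.
df = pd.DataFrame(
    {"age": [36, 29, 41], "city": ["Oslo", "Lima", "Pune"]},
    index=["ada", "ben", "cy"],
)

# A single position returns that row as a Series.
first_row = df.iloc[0]

# A range of positions returns a DataFrame; the end position is excluded.
first_two = df.iloc[0:2]

print(first_two)
```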

<DataFrame>.set_index(<string>): Sets the existing column(s) named <string> as the index of the DataFrame.

<DataFrame>.head(<numeric>): Returns the first <numeric> row(s). If no parameter (<numeric>) is given, it returns the first five rows.
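For instance, with a made-up ten-row DataFrame, head() can be tried with and without an argument:

```python
import pandas as pd

# Hypothetical example data: ten rows numbered 0 to 9.
df = pd.DataFrame({"n": range(10)})

first_five = df.head()     # no argument: the first five rows
first_three = df.head(3)   # explicit count: the first three rows

print(len(first_five), len(first_three))
```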

<pandas>.DataFrame(<data>): Used to create a DataFrame with the given data.

<pandas>.read_csv(): Used to read a CSV file into a DataFrame.
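The two ways of creating a DataFrame described above might look like the sketch below. To keep it self-contained, the CSV content is a made-up string wrapped in io.StringIO rather than a real file; in practice read_csv is usually given a file path.

```python
import io

import pandas as pd

# Build a DataFrame directly from a dictionary of columns (hypothetical data).
from_dict = pd.DataFrame({"name": ["Ada", "Ben"], "age": [36, 29]})

# Read the same data from CSV text. io.StringIO stands in for a file on disk.
csv_text = "name,age\nAda,36\nBen,29\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

print(from_csv)
```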

<DataFrame>.set_index(<column>): Gets the values of the given column and sets them as the index. The rows keep their existing order; use sort_index() to sort them in ascending order by the new index.
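A small sketch of set_index on made-up data follows. Note that in pandas the call returns a new DataFrame and leaves the original unchanged, and the rows stay in their existing order.

```python
import pandas as pd

# Hypothetical example data, deliberately not in alphabetical order.
df = pd.DataFrame({"name": ["Cy", "Ada", "Ben"], "age": [41, 36, 29]})

# Use the "name" column as the index; the result is a new DataFrame.
indexed = df.set_index("name")

print(indexed)
print(indexed.loc["Ada"])  # the new index can now be used with .loc
```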

<pandas>.to_numeric(): Converts the argument inside the parentheses into numeric values.
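As an illustration, the sketch below converts a made-up Series of strings. The errors="coerce" option is one possible choice, shown here as an assumption about how bad values should be handled: it turns anything that cannot be parsed into NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical example data: numbers stored as text, plus one bad value.
raw = pd.Series(["1", "2.5", "abc"])

# errors="coerce" replaces values that cannot be converted with NaN.
numbers = pd.to_numeric(raw, errors="coerce")

print(numbers)
```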

<series>.str.startswith(<string>): Checks whether each string in the Series starts with the given parameter (<string>), and returns a Series of Boolean values (True or False), one for each element.
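A brief sketch with made-up names shows the Boolean result, and how it can then be used to select the matching entries:

```python
import pandas as pd

# Hypothetical example data.
names = pd.Series(["Ada", "Alan", "Ben"])

# One True/False value per element of the Series.
starts_with_a = names.str.startswith("A")

# The Boolean Series can be used to keep only the matching entries.
a_names = names[starts_with_a]

print(a_names)
```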

<DataFrame>.sort_index(): Sorts the rows of the DataFrame by their index labels. By default, the rows are sorted in ascending order.
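To see sort_index in action, the sketch below uses a made-up DataFrame whose index labels start out unordered. The ascending=False variant is included only to show the optional parameter.

```python
import pandas as pd

# Hypothetical example data with unordered index labels.
df = pd.DataFrame({"age": [41, 36, 29]}, index=["cy", "ada", "ben"])

# Default: rows sorted by index label in ascending order.
by_label = df.sort_index()

# Optional: reverse the order.
descending = df.sort_index(ascending=False)

print(by_label)
```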
