pandas

pandas is a Python library for data manipulation and analysis.

Libraries

  • Pandas Bokeh: a Bokeh plotting backend for pandas and GeoPandas.

Notes

  • pandas.json_normalize (pandas.io.json.json_normalize before pandas 1.0) is a function to normalize semi-structured JSON into a flat DataFrame. Useful for working with data that comes from a JSON API.
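A minimal sketch of what that flattening looks like, assuming pandas 1.0 or later (where the function is available as pd.json_normalize). The records below are made up for illustration and stand in for a parsed API response.

```python
import pandas as pd

# Hypothetical API response: each record has a nested "address" object.
records = [
    {"id": 1, "name": "Ada", "address": {"city": "Toronto", "province": "ON"}},
    {"id": 2, "name": "Grace", "address": {"city": "Oakville", "province": "ON"}},
]

# Nested keys are flattened into columns. By default they are joined
# with ".", here sep="_" yields address_city and address_province.
df = pd.json_normalize(records, sep="_")

print(df)
```

Using sep="_" keeps the flattened names usable as Python attribute-style keys, which "." would not.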

Snippets

Connect to a SQLite database

import pandas as pd
import sqlite3

conn = sqlite3.connect("database.sqlite")
df = pd.read_sql_query("SELECT * FROM table_name;", conn)

df.head()
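To try this without an existing database file, here is a self-contained sketch that uses an in-memory SQLite database. The cities table and its figures are invented for the example; the params argument is one way to pass values to the query without formatting them into the SQL string.

```python
import sqlite3

import pandas as pd

# ":memory:" creates a throwaway database that disappears on close.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("Toronto", 100), ("Markdale", 5)],  # made-up figures
)
conn.commit()

# sqlite3 uses "?" placeholders; pandas forwards params to the driver.
df = pd.read_sql_query(
    "SELECT * FROM cities WHERE population > ?;", conn, params=(10,)
)
conn.close()

print(df)
```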

Using a SQLAlchemy engine to connect to a database

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql:///database")

df = pd.read_sql_query("SELECT * FROM table_name;", con=engine)

df.head()

Python compatible column names with slugify

Usually I'm dealing with data from external sources that don't have pretty column names. I like to use slugify to convert them to Python-compatible keys.

from slugify import slugify

df.columns = [slugify(c, separator="_", to_lower=True) for c in df.columns]

Pandas/SQL Rosetta Stone

IN / pandas.DataFrame.isin

SELECT * FROM table_name WHERE city IN ('Toronto', 'Richmond Hill');

# City is either Toronto or Richmond Hill:
df[df['city'].isin(['Toronto', 'Richmond Hill'])]

# City is not Markdale or Oakville:
df[~df['city'].isin(['Markdale', 'Oakville'])]

See the pandas documentation for more information on pandas.DataFrame.isin.
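A small worked example of both filters, using a made-up DataFrame (the visits column is only there for illustration). Note that df['city'] is a Series, so the call here is Series.isin, which returns a boolean mask that can be negated with ~.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame(
    {
        "city": ["Toronto", "Markdale", "Richmond Hill", "Oakville"],
        "visits": [3, 1, 2, 5],
    }
)

# Boolean mask: True where city is one of the listed values.
in_mask = df["city"].isin(["Toronto", "Richmond Hill"])
selected = df[in_mask]

# ~ inverts the mask, which corresponds to NOT IN in SQL.
excluded = df[~df["city"].isin(["Markdale", "Oakville"])]

print(selected)
print(excluded)
```

With this sample data both filters keep the same two rows, Toronto and Richmond Hill.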
