
A few random reference notes…

By Angela

August 20, 2021

Reading time: 6 minutes.

A place to keep reference notes on various topics, for now including Jupyter notebooks, Python magic functions, Markdown, working with files, etc.

  • Pickling files in Python
  • Jupyter Notebooks: converting to other formats, viewing online, warning messages.
  • Embed pandas DataFrames as images in pdf and markdown files
  • Adding pandas DataFrame tables from a Jupyter notebook to a Markdown post
  • Images
  • Navigating to iCloud Drive from Terminal
  • Adding multiple spaces to Markdown using Non-Breaking Spaces
  • Unicode Database unicodedata module
  • Python magic functions: Importing Python functions from another Jupyter notebook or script
  • Embedding iframes in a markdown post
  • Styling dataframes in Python
  • Python dir() command to see what is in the current scope and getting help

Pickling files in Python

Pickling saves a Python object, such as a pandas DataFrame, to disk exactly as it is. When the pickled file is later loaded, any changes made before saving have been preserved. I found this quicker when working with large dataframes: datetime columns and other dtypes survive the round trip, and loading the pickled dataframe saves time compared with reading the CSV file back in and rebuilding the dataframe from scratch.

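import pickle

file2 = 'hourly2.pkl'

# Save the dataframe df2 to a pickle file (binary write mode)
with open(file2, 'wb') as f:
    pickle.dump(df2, f)

# Load the dataframe back from the pickle file (binary read mode)
with open(file2, 'rb') as fp:
    df2 = pickle.load(fp)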
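To illustrate why pickling keeps types that a CSV round trip loses, here is a minimal sketch using only the standard library. The `rows` list is a hypothetical stand-in for a dataframe with a datetime column; it is not data from these notes.

```python
import csv
import io
import pickle
from datetime import datetime

# Hypothetical sample rows standing in for a dataframe with a datetime column
rows = [
    {"time": datetime(2021, 8, 20, 9, 0), "temp": 14.5},
    {"time": datetime(2021, 8, 20, 10, 0), "temp": 15.1},
]

# Pickle round trip (in memory): the objects come back with their original types
restored = pickle.loads(pickle.dumps(rows))

# CSV round trip (in memory): every value comes back as a plain string
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["time", "temp"])
writer.writeheader()
writer.writerows(rows)
buffer.seek(0)
from_csv = list(csv.DictReader(buffer))

print(type(restored[0]["time"]).__name__)  # datetime
print(type(from_csv[0]["time"]).__name__)  # str
```

For dataframes specifically, pandas also provides `DataFrame.to_pickle` and `pandas.read_pickle`, which wrap the same idea. Only load pickle files from sources you trust, since unpickling can execute arbitrary code.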


Images

  • Using random images from “unsplash.com” in a webpage:

![splash image](https://source.unsplash.com/random/400x200)

  • Wikipedia commons images

For example here is a link to a Scikit-learn machine learning decision tree image on https://commons.wikimedia.org/


Navigating to iCloud Drive from Terminal

cd ~/Library/Mobile\ Documents/com~apple~CloudDocs/


Jupyter notebooks

  • To convert a notebook to another format: $ jupyter nbconvert my_notebook.ipynb --to html --template basic --output output.html

  • To render Jupyter notebooks that might be slow to load on GitHub, paste the URL into https://nbviewer.jupyter.org/

  • To ignore warnings (including deprecation warnings) in Jupyter when loading older notebooks that are not run in a virtual environment:

import warnings
warnings.filterwarnings('ignore')
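Note that `warnings.filterwarnings('ignore')` hides every warning, not only deprecation warnings. A narrower alternative is to filter by category. The sketch below assumes a hypothetical `old_function` that stands in for library code raising a `DeprecationWarning`; it records warnings so the effect of the filter is visible.

```python
import warnings

def old_function():
    """Hypothetical stand-in for library code that emits a deprecation warning."""
    warnings.warn("old_function is deprecated", DeprecationWarning)
    return 42

# Record warnings inside this block so the effect of the filters is visible
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # start from a state where everything is shown
    warnings.filterwarnings("ignore", category=DeprecationWarning)  # hide only this category
    result = old_function()  # its DeprecationWarning is suppressed
    warnings.warn("something else needs attention", UserWarning)  # still recorded

print(result)
print([w.category.__name__ for w in caught])
```

In a notebook, the single line `warnings.filterwarnings("ignore", category=DeprecationWarning)` is usually enough; the `catch_warnings` block is here only to demonstrate what gets through.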

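To convert a notebook to Markdown without the code input cells, writing any output files (such as images) to the current directory:

jupyter nbconvert index.ipynb --to markdown --TemplateExporter.exclude_input=True --NbConvertApp.output_files_dir=.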


Adding pandas DataFrame tables from a Jupyter notebook to a Markdown post.

The raw HTML of a pandas DataFrame can simply be copied into a Markdown post, or the Jupyter notebook can be converted to Markdown using the nbconvert commands above. The DataFrame HTML includes a style tag containing styling for the table. However, because the table is raw HTML, a Markdown file converted from a Jupyter notebook may not render the table data and may instead just show the div and style tags.

You can instead create a shortcode containing {{ .Inner }} and then wrap the table in the opening and closing shortcode tags. Wrapping the entire dataframe HTML in the shortcode renders the HTML and displays the table using the dataframe style tags.


<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">

To change the styling you can replace the style tags with a Bootstrap table class such as “table-sm”, “table-striped”, “table-responsive-sm”, etc.

There is probably another way… There is a package that allows you to print dataframe tables as images in a notebook.


Embed pandas DataFrames as images in pdf and markdown files when converting from Jupyter Notebooks

When converting Jupyter Notebooks to pdf using nbconvert, pandas DataFrames appear as either raw text or as simple LaTeX tables. This package can be used to embed Dataframes into pdf and markdown documents as images so they appear exactly as they do in a Jupyter notebook.

https://pypi.org/project/dataframe-image/

import dataframe_image as dfi

# Export the styled dataframe as a PNG image
dfi.export(df_styled, 'df_styled.png')

Styling dataframes in Python

  • To set a caption: .style.set_caption('A caption to set for a dataframe')

  • To set a caption in bold: .style.set_caption('<b>A bold caption</b>')

  • To set a background gradient on a dataframe: .style.background_gradient('cividis')


Adding multiple spaces to Markdown using Non-Breaking Spaces (&nbsp;)

HTML collapses multiple consecutive spaces into a single space. To keep them, write each extra space as the &nbsp; entity. When the Markdown is converted to HTML, the spaces will be preserved.
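If the text is generated from Python, the substitution can be automated. This is a minimal sketch; the helper name `preserve_spaces` is hypothetical and the approach (keep one ordinary space, turn the rest into entities) is just one reasonable choice.

```python
import re

def preserve_spaces(text):
    """Replace each run of 2+ spaces with one normal space followed by &nbsp; entities."""
    def _swap(match):
        extra = len(match.group(0)) - 1  # spaces beyond the first
        return " " + "&nbsp;" * extra
    return re.sub(r" {2,}", _swap, text)

padded = preserve_spaces("Name:    Angela")
print(padded)
```

Keeping one ordinary space in each run still gives the browser a place to wrap the line, while the entities preserve the visual width.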


Python

import os
os.listdir('data')

The negation operator ~

df1 = df[~df['Country'].isin(regions) & (df['Indicator']=='Population')]
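A small self-contained sketch of the same pattern, using made-up data (the column values and the `regions` list here are hypothetical). `~` inverts a boolean Series element by element and binds more tightly than `&`, while `==` binds more loosely than `&`, which is why the comparison needs its own parentheses in the one-line version.

```python
import pandas as pd

# Hypothetical data: 'regions' lists aggregate rows to exclude
df = pd.DataFrame({
    "Country": ["Ireland", "World", "France", "Europe", "Ireland"],
    "Indicator": ["Population", "Population", "Population", "Population", "GDP"],
    "Value": [5.0, 7800.0, 67.0, 750.0, 400.0],
})
regions = ["World", "Europe"]

is_region = df["Country"].isin(regions)
is_population = df["Indicator"] == "Population"

# ~ flips each boolean, so this keeps rows that are NOT regions AND are Population
df1 = df[~is_region & is_population]
print(df1["Country"].tolist())
```

Naming the two masks first makes the intent easier to read than a single long expression.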

Formatting

f'The population of {country} in {year} was {population:,.0f}'

where the comma , sets the thousands separator, the dot . introduces the precision, and the number after the dot is the number of decimal places. The trailing f selects fixed-point notation, the usual presentation for floats.
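A quick sketch of the format specification in action. The values of `country`, `year` and `population` are hypothetical, chosen only for illustration.

```python
country = "Ireland"       # hypothetical values for illustration
year = 2021
population = 5011500.4

sentence = f"The population of {country} in {year} was {population:,.0f}"
two_decimals = f"{population:,.2f}"   # thousands separator, two decimal places
no_separator = f"{population:.1f}"    # one decimal place, no separator

print(sentence)
print(two_decimals)
print(no_separator)
```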


Unicode Database unicodedata module

Flags / Emojis

from unicodedata import lookup

If you have the two-letter code of a country, such as ie for Ireland, look up the REGIONAL INDICATOR SYMBOL LETTER for each letter using the unicodedata module, then concatenate the results to create the flag symbol.

lookup('REGIONAL INDICATOR SYMBOL LETTER i') + lookup('REGIONAL INDICATOR SYMBOL LETTER e')
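The same lookup can be wrapped in a small helper. This is a sketch; the function name `country_flag` is hypothetical. It prints the code points rather than the flag itself, since whether the pair renders as a flag depends on the font and platform.

```python
from unicodedata import lookup

def country_flag(code):
    """Build a flag emoji from a two-letter country code such as 'ie'."""
    if len(code) != 2 or not code.isascii() or not code.isalpha():
        raise ValueError(f"expected a two-letter code, got {code!r}")
    return "".join(
        lookup(f"REGIONAL INDICATOR SYMBOL LETTER {letter}") for letter in code.upper()
    )

flag = country_flag("ie")
print([f"U+{ord(ch):X}" for ch in flag])
```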


Missing values

  • .isna() returns a DataFrame of booleans with the same shape as the original, True where a value is missing.
  • .isna().mean() gets the fraction of missing values per column, as a Series (multiply by 100 for a percentage).
  • .isna().mean().mean() gets the overall fraction of missing values across all columns.
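A short sketch of the three calls on a made-up dataframe (the data is hypothetical: 3 missing values out of 8 cells). The means are values between 0 and 1, because True counts as 1 and False as 0.

```python
import pandas as pd

# Hypothetical dataframe with 3 missing values out of 8 cells
df = pd.DataFrame({
    "a": [1.0, None, 3.0, None],
    "b": ["x", "y", None, "z"],
})

missing_mask = df.isna()            # booleans, same shape as df
per_column = df.isna().mean()       # share of missing values in each column
overall = df.isna().mean().mean()   # average of the per-column shares

print(per_column.to_dict())
print(overall)
```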

dir()

Python: Using dir() to see what is in the current scope and getting help

  • dir() is a built-in function.

  • dir? shows the docstring in Jupyter or IPython:

dir([object]) -> list of strings

  • dir() without an argument will return the names in the current scope

  • Otherwise if dir() is called with an argument, it will return an alphabetized list of names comprising (some of) the attributes of the given object, and of attributes reachable from it.

  • If the object supplies a method named __dir__, it will be used; otherwise the default dir() logic is used and returns:

    • for a module object: the module’s attributes.
    • for a class object: its attributes, and recursively the attributes of its bases.
    • for any other object: its attributes, its class’s attributes, and recursively the attributes of its class’s base classes.

Using a list comprehension to filter the names:

For example, to see all attributes of the object ‘df’ that contain the string “reset”: [d for d in dir(df) if 'reset' in d]

['_reset_cache', '_reset_cacher', 'reset_index']
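The same filtering works on any object, without pandas. A minimal sketch follows; the `Point` class is hypothetical and exists only to show how instance and class attributes both appear in the result.

```python
class Point:
    """Hypothetical class used only to illustrate dir()."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def reset(self):
        self.x, self.y = 0, 0

p = Point(1, 2)

# Filter the names dir() reports, here those on str containing 'strip'
strip_names = [name for name in dir(str) if "strip" in name]

# Public names on the instance: instance attributes plus class attributes
public_names = [name for name in dir(p) if not name.startswith("_")]

print(strip_names)
print(public_names)
```

Filtering out names that start with an underscore is a handy way to hide private and dunder attributes when exploring an unfamiliar object.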

Tab key

In Jupyter to find the methods available on any of the objects in the current scope use the tab key.

For example, if the variable df (holding a pandas DataFrame) is in the current scope, then df. and the tab key will show all the available attributes and methods.

If the os module is in the current scope, then os. and tab will show all the available modules, classes, instances and functions.


Styling dataframe

See Pandas Style

Styler objects are returned by pandas.DataFrame.style.

For example: df.style.set_caption('<b>A bold caption for the dataframe df</b>')


Embedding iframes

https://getbootstrap.com/docs/5.1/helpers/ratio/

Embed plotly html files in an iframe and use a shortcode to display the raw html.

See https://getbootstrap.com/docs/5.1/helpers/ratio/ for setting ratios.


Importing Python functions from another Jupyter notebook or script

  • The magic function %load replaces the contents of the cell with an external script from either a local source or from a URL.
  • %%writefile saves the content of a cell to an external file.
  • %pycat shows the contents of an external file in a popup.
  • %run runs the contents of another file.
  • !ls *.py lists all Python .py scripts in the directory.

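So if you have a function written in one notebook, you can save it to a script and load it into another notebook:

  • %%writefile my_script.py (at the top of the cell that defines the function)
  • %load my_script.py (in the notebook that needs it)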
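Outside a notebook the magics are not available, but the same save-then-reuse workflow can be approximated in plain Python. This is only a rough analogue, not how the magics are implemented; the `double` function and the temporary directory are hypothetical and used just for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical script contents; in a notebook, %%writefile my_script.py would save a cell like this
source = '''
def double(x):
    """Return twice the input."""
    return 2 * x
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "my_script.py"
    script.write_text(source)  # roughly what %%writefile does

    # Load the file as a module so its functions can be reused
    spec = importlib.util.spec_from_file_location("my_script", script)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

answer = module.double(21)
print(answer)
```

When the script sits in the same directory as the notebook, a plain `from my_script import double` does the same job with less ceremony.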

How to render maths equations in Hugo.

I came across this post by Geoff Ruddock, which worked perfectly.

  • Render LaTeX math expressions in Hugo with MathJax 3

  • Create a layouts/partials/mathjax_support.html file in your themes directory

  • Add a line containing {{ partial "mathjax_support.html" . }} before the closing head tag in layouts/partials/header.html. I have separate partials for the head and header sections, so I put this into the head partial.

  • Add some code to the CSS style file for code.has-jax.

$$\int_{a}^{b} x^2 dx$$


from IPython.display import display, HTML

from IPython.core.display import HTML