Databricks Notebooks¶

Agenda¶

The agenda is:

  1. What Databricks Notebooks are and why they are useful.
  2. The course material.

Databricks Home¶

We will be using Databricks notebooks in the Databricks Community Edition.

Course Material¶

Code Cells¶

A code cell evaluates the code written in it and shows the result below the cell.
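As a minimal sketch of what a Python code cell might contain, the example below assigns some values, prints a message, and ends with a bare expression. The variable names and numbers are illustrative assumptions, not part of the course material.

```python
# Illustrative values only; any Python code can go in a code cell.
numbers = [1, 2, 3, 4]

# Statements run from top to bottom when the cell is executed.
total = sum(numbers)
print(f"Sum of {numbers} is {total}")

# In a notebook, the value of the last expression is shown below the cell.
total
```

Variables defined in one cell, such as `total` here, remain available to cells that are run afterwards in the same notebook session.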

Click the run button on the cell to run it. You can run the entire notebook by clicking the run-all button at the top.

Markdown¶

Use the %md magic function to write Markdown in a code cell and document the steps in your notebook.

Markdown: Text Formatting¶

This is the input


*This is Italic text*
**This is Bold text**
# First-level header
## Second-level header

This is the output


This is Italic text
This is Bold text

First-level header

Second-level header

Markdown: Math with LaTeX¶

This is the input


A LaTeX math expression inside a pair of single dollar signs, such as $\alpha+\beta$, is rendered inline.

A LaTeX math expression inside a pair of double dollar signs is displayed on its own line:

$$\frac{1}{n}\sum_{i=1}^nX_i$$

This is the output


A LaTeX math expression inside a pair of single dollar signs, such as $\alpha+\beta$, is rendered inline.

A LaTeX math expression inside a pair of double dollar signs is displayed on its own line:

$$\frac{1}{n}\sum_{i=1}^nX_i$$
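The displayed formula is the sample mean of $X_1, \dots, X_n$. As a small sketch of how a Markdown cell with this formula could be followed by a code cell that computes it, the example below uses made-up sample values (an assumption for illustration only).

```python
# Hypothetical sample values X_1, ..., X_n (illustrative only).
samples = [2.0, 4.0, 6.0, 8.0]

# n is the number of observations.
n = len(samples)

# Sample mean: (1 / n) * sum of X_i for i = 1..n.
sample_mean = sum(samples) / n
print(sample_mean)
```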

More Magic Functions¶

  • %md Markdown
  • %sql SQL
  • %scala Scala
  • %python Python
  • %r R
  • %sh Shell
  • %fs Databricks File System
  • %run Run other notebooks

Run another Notebook¶

You can run another notebook from within a notebook.
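With the %run magic function, this is typically a cell that contains only a line such as `%run ./other_notebook` (the path here is a hypothetical placeholder). From Python code, Databricks also provides `dbutils.notebook.run(path, timeout_seconds, arguments)`. The sketch below wraps that call in a helper; the helper name, the notebook path, and the fallback behaviour outside Databricks are assumptions made for illustration.

```python
def run_child_notebook(path, timeout_seconds=60, arguments=None):
    """Run another notebook and return its exit value.

    Returns None when `dbutils` is not defined, i.e. when this code
    is not running inside a Databricks notebook.
    """
    try:
        # `dbutils` is provided by the Databricks notebook environment.
        notebook_utils = dbutils.notebook
    except NameError:
        return None
    return notebook_utils.run(path, timeout_seconds, arguments or {})


# Hypothetical relative path to a notebook in the same folder.
result = run_child_notebook("./hypothetical_child_notebook")
print(result)
```

The two approaches differ in one respect worth knowing: %run executes the other notebook inline, so its variables and functions become available in the calling notebook, whereas `dbutils.notebook.run` starts a separate run and only returns the child notebook's exit value.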

Exercise 2.1.¶

  1. Sign up for Databricks Community Edition: https://docs.databricks.com/en/getting-started/community-edition.html (Important: click "Get started with Community Edition" in step 3.)
  2. On the left-hand side, click the Workspace button.
  3. Click the Shared folder, then right-click and select Import. Click Browse and select applied_data_science_with_pyspark.zip from the Desktop.
  4. Click the Workspace button again, open the notebook Shared/applied_data_science_with_pyspark/2_databricks_notebooks/2_1_exercise, and start the exercise.