Commit cc176ce

Update default templates (#108)
This updates the templates based on the latest CLI, using `scripts/update_templates.sh`.
Parent: 96f0b51

30 files changed: 175 additions & 364 deletions

default_python/README.md

Lines changed: 33 additions & 17 deletions

````diff
@@ -2,18 +2,39 @@
 
 The 'default_python' project was generated by using the default-python template.
 
+For documentation on the Databricks Asset Bundles format use for this project,
+and for CI/CD configuration, see https://docs.databricks.com/aws/en/dev-tools/bundles.
+
 ## Getting started
 
-0. Install UV: https://docs.astral.sh/uv/getting-started/installation/
+Choose how you want to work on this project:
+
+(a) Directly in your Databricks workspace, see
+    https://docs.databricks.com/dev-tools/bundles/workspace.
+
+(b) Locally with an IDE like Cursor or VS Code, see
+    https://docs.databricks.com/vscode-ext.
+
+(c) With command line tools, see https://docs.databricks.com/dev-tools/cli/databricks-cli.html
+
+
+Dependencies for this project should be installed using uv:
 
-1. Install the Databricks CLI from https://docs.databricks.com/dev-tools/cli/databricks-cli.html
+* Make sure you have the UV package manager installed.
+  It's an alternative to tools like pip: https://docs.astral.sh/uv/getting-started/installation/.
+* Run `uv sync --dev` to install the project's dependencies.
 
-2. Authenticate to your Databricks workspace, if you have not done so already:
+# Using this project using the CLI
+
+The Databricks workspace and IDE extensions provide a graphical interface for working
+with this project. It's also possible to interact with it directly using the CLI:
+
+1. Authenticate to your Databricks workspace, if you have not done so already:
    ```
    $ databricks configure
    ```
 
-3. To deploy a development copy of this project, type:
+2. To deploy a development copy of this project, type:
    ```
    $ databricks bundle deploy --target dev
    ```
@@ -23,9 +44,9 @@ The 'default_python' project was generated by using the default-python template.
 This deploys everything that's defined for this project.
 For example, the default template would deploy a job called
 `[dev yourname] default_python_job` to your workspace.
-You can find that job by opening your workpace and clicking on **Workflows**.
+You can find that job by opening your workpace and clicking on **Jobs & Pipelines**.
 
-4. Similarly, to deploy a production copy, type:
+3. Similarly, to deploy a production copy, type:
    ```
    $ databricks bundle deploy --target prod
    ```
@@ -35,17 +56,12 @@ The 'default_python' project was generated by using the default-python template.
 is paused when deploying in development mode (see
 https://docs.databricks.com/dev-tools/bundles/deployment-modes.html).
 
-5. To run a job or pipeline, use the "run" command:
+4. To run a job or pipeline, use the "run" command:
    ```
    $ databricks bundle run
    ```
-6. Optionally, install the Databricks extension for Visual Studio code for local development from
-   https://docs.databricks.com/dev-tools/vscode-ext.html. It can configure your
-   virtual environment and setup Databricks Connect for running unit tests locally.
-   When not using these tools, consult your development environment's documentation
-   and/or the documentation for Databricks Connect for manually setting up your environment
-   (https://docs.databricks.com/en/dev-tools/databricks-connect/python/index.html).
-
-7. For documentation on the Databricks asset bundles format used
-   for this project, and for CI/CD configuration, see
-   https://docs.databricks.com/dev-tools/bundles/index.html.
+
+5. Finally, to run tests locally, use `pytest`:
+   ```
+   $ uv run pytest
+   ```
````

default_python/pyproject.toml

Lines changed: 7 additions & 13 deletions

```diff
@@ -2,26 +2,20 @@
 name = "default_python"
 version = "0.0.1"
 authors = [{ name = "user@company.com" }]
-requires-python = ">= 3.11"
+requires-python = ">=3.10,<=3.13"
 
-[project.optional-dependencies]
+[dependency-groups]
 dev = [
     "pytest",
 
-    # Code completion support for DLT, also install databricks-connect
+    # Code completion support for Lakeflow Declarative Pipelines, also install databricks-connect
     "databricks-dlt",
 
     # databricks-connect can be used to run parts of this project locally.
-    # See https://docs.databricks.com/dev-tools/databricks-connect.html.
-    #
-    # Note, databricks-connect is automatically installed if you're using Databricks
-    # extension for Visual Studio Code
-    # (https://docs.databricks.com/dev-tools/vscode-ext/dev-tasks/databricks-connect.html).
-    #
-    # To manually install databricks-connect, uncomment the line below to install a version
-    # of db-connect that corresponds to the Databricks Runtime version used for this project.
-    # See https://docs.databricks.com/dev-tools/databricks-connect.html
-    # "databricks-connect>=15.4,<15.5",
+    # Note that for local development, you should use a version that is not newer
+    # than the remote cluster or serverless compute you connect to.
+    # See also https://docs.databricks.com/dev-tools/databricks-connect.html.
+    "databricks-connect>=15.4,<15.5",
 ]
 
 [tool.pytest.ini_options]
```

default_python/resources/default_python.job.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -40,6 +40,6 @@ resources:
       # Full documentation of this spec can be found at:
       # https://docs.databricks.com/api/workspace/jobs/create#environments-spec
       spec:
-        client: "2"
+        environment_version: "2"
         dependencies:
           - ../dist/*.whl
```

default_python/resources/default_python.pipeline.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -8,7 +8,7 @@ resources:
       serverless: true
       libraries:
         - notebook:
-            path: ../src/dlt_pipeline.ipynb
+            path: ../src/pipeline.ipynb
 
       configuration:
         bundle.sourcePath: ${workspace.file_path}/src
```

default_python/scratch/exploration.ipynb

Lines changed: 1 addition & 1 deletion

```diff
@@ -32,7 +32,7 @@
     "sys.path.append(\"../src\")\n",
     "from default_python import main\n",
     "\n",
-    "main.get_taxis(spark).show(10)"
+    "main.get_taxis().show(10)"
    ]
   }
 ],
```

default_python/src/default_python/main.py

Lines changed: 4 additions & 15 deletions

```diff
@@ -1,24 +1,13 @@
-from pyspark.sql import SparkSession, DataFrame
+from databricks.sdk.runtime import spark
+from pyspark.sql import DataFrame
 
 
-def get_taxis(spark: SparkSession) -> DataFrame:
+def find_all_taxis() -> DataFrame:
     return spark.read.table("samples.nyctaxi.trips")
 
 
-# Create a new Databricks Connect session. If this fails,
-# check that you have configured Databricks Connect correctly.
-# See https://docs.databricks.com/dev-tools/databricks-connect.html.
-def get_spark() -> SparkSession:
-    try:
-        from databricks.connect import DatabricksSession
-
-        return DatabricksSession.builder.getOrCreate()
-    except ImportError:
-        return SparkSession.builder.getOrCreate()
-
-
 def main():
-    get_taxis(get_spark()).show(5)
+    find_all_taxis().show(5)
 
 
 if __name__ == "__main__":
```
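The deleted `get_spark()` helper implemented an optional-import fallback: prefer `DatabricksSession` from databricks-connect, and fall back to a plain `SparkSession` when that package is absent. A generic, self-contained sketch of that pattern (module names are illustrative stand-ins; neither Spark package is assumed to be installed here):

```python
# Generic sketch of the optional-import fallback the removed get_spark()
# helper used. `preferred_backend` is a hypothetical stand-in for
# databricks.connect; the template now imports `spark` directly from
# databricks.sdk.runtime instead.
def resolve_backend() -> str:
    try:
        import preferred_backend  # hypothetical module, like databricks.connect
        return "preferred"
    except ImportError:
        # Fall back to a locally available default, as get_spark() fell
        # back to SparkSession.builder.getOrCreate().
        return "fallback"

print(resolve_backend())
```

Dropping the fallback means the module now assumes a Databricks runtime or a databricks-connect environment, which lines up with the pinned `databricks-connect>=15.4,<15.5` dev dependency elsewhere in this commit.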

default_python/src/dlt_pipeline.ipynb

Lines changed: 0 additions & 90 deletions
This file was deleted.

default_python/src/notebook.ipynb

Lines changed: 1 addition & 1 deletion

```diff
@@ -46,7 +46,7 @@
    "source": [
     "from default_python import main\n",
     "\n",
-    "main.get_taxis(spark).show(10)"
+    "main.find_all_taxis().show(10)"
    ]
   }
 ],
```

default_python/tests/main_test.py

Lines changed: 3 additions & 3 deletions

```diff
@@ -1,6 +1,6 @@
-from default_python.main import get_taxis, get_spark
+from default_python import main
 
 
-def test_main():
-    taxis = get_taxis(get_spark())
+def test_find_all_taxis():
+    taxis = main.find_all_taxis()
     assert taxis.count() > 5
```
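Switching the test to a module-level import (`from default_python import main`) keeps the function reachable as a module attribute, which makes it easy to stub in tests that should not touch a real workspace. A minimal sketch of that access style, using a stand-in namespace since `default_python` itself is not assumed installed here:

```python
# Sketch: attribute-style access (main.find_all_taxis) is convenient for
# stubbing. `fake_main` is an illustrative stand-in for default_python.main.
from types import SimpleNamespace

fake_main = SimpleNamespace(
    # Stub returning an object with .count(), shaped like the DataFrame
    # the real find_all_taxis() reads from samples.nyctaxi.trips.
    find_all_taxis=lambda: SimpleNamespace(count=lambda: 6)
)

def test_find_all_taxis():
    taxis = fake_main.find_all_taxis()
    assert taxis.count() > 5

test_find_all_taxis()
print("ok")
```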

lakeflow_pipelines_python/.gitignore

Lines changed: 2 additions & 0 deletions

```diff
@@ -4,5 +4,7 @@ dist/
 __pycache__/
 *.egg-info
 .venv/
+scratch/**
+!scratch/README.md
 **/explorations/**
 **/!explorations/README.md
```
