Get Started with Amundsen
Bootstrap a default version of Amundsen using Docker
The following instructions are for setting up a version of Amundsen using Docker.
Make sure you have at least 3GB available to docker. Install docker and docker-compose.
Clone this repo and its submodules by running:
$ git clone --recursive git@github.com:amundsen-io/amundsen.git
Enter the cloned directory and run:
# For Neo4j Backend
$ docker-compose -f docker-amundsen.yml up
# For Atlas
$ docker-compose -f docker-amundsen-atlas.yml up
Ingest provided sample data into Neo4j by doing the following: (Please skip if you are using Atlas backend)
- In a separate terminal window, change directory to the amundsendatabuilder submodule.
- The sample_data_loader Python script included in the example/ directory uses the elasticsearch client, pyhocon and other libraries. Install the dependencies in a virtual env and run the script by following the commands below:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt
$ python3 setup.py install
$ python3 example/scripts/sample_data_loader.py
View the UI at http://localhost:5000 and try searching for test; it should return some results.
You can also do an exact-match search on a table entity. For example, search for test_table1 in the table field and it returns the matching records.
Atlas Note: Atlas takes some time to boot properly, so you may not see results immediately after the docker-compose up command.
Atlas is ready once the following line appears in the docker output: Amundsen Entity Definitions Created...
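Rather than watching the terminal, you could scan the compose logs for that marker line. The sketch below is a hypothetical helper (atlas_ready is not part of Amundsen or Atlas) and assumes the marker text appears verbatim in the log output, as quoted in the note above.

```python
from typing import Iterable

# Marker text quoted in the Atlas note; assumed to appear verbatim in the logs.
READY_MARKER = "Amundsen Entity Definitions Created"


def atlas_ready(log_lines: Iterable[str]) -> bool:
    """Return True if any log line contains the readiness marker."""
    return any(READY_MARKER in line for line in log_lines)
```

One way to use it is to capture the output of docker-compose -f docker-amundsen-atlas.yml logs, split it into lines, and pass those lines to atlas_ready, repeating until it returns True.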
Verify setup
You can verify that dummy data has been ingested into Neo4j by visiting http://localhost:7474/browser/ and running MATCH (n:Table) RETURN n LIMIT 25 in the query box. You should see two tables:
hive.test_schema.test_table1
hive.test_schema.test_table2
You can verify the data has been loaded into the metadataservice by visiting:
Troubleshooting
If the Docker host does not allow enough virtual memory areas for Elasticsearch, es_amundsen will fail during docker-compose.
- docker-compose error: es_amundsen | [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
- Increase the vm.max_map_count limit on the host:
  - Edit /etc/sysctl.conf
  - Make entry vm.max_map_count=262144. Save and exit.
  - Reload settings: $ sysctl -p
  - Restart docker-compose
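Before editing /etc/sysctl.conf, it can help to check what the host currently allows. The following is a minimal sketch for a Linux host: the helper names are hypothetical, the 262144 threshold comes from the error message above, and /proc/sys/vm/max_map_count is the standard Linux location for the live value (it does not exist on macOS or Windows hosts).

```python
from pathlib import Path

# Minimum quoted in the Elasticsearch error message above.
REQUIRED_MAX_MAP_COUNT = 262144


def needs_increase(current: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True when the current vm.max_map_count is below the required minimum."""
    return current < required


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the live kernel value on a Linux host."""
    return int(Path(path).read_text().strip())
```

Calling needs_increase(read_max_map_count()) on the Docker host tells you whether the sysctl change is still required; running it again after sysctl -p confirms that the new value is active.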
If docker-amundsen-local.yml stops because of org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Failed to create node environment, then es_amundsen cannot write to .local/elasticsearch.
- chown -R 1000:1000 .local/elasticsearch
- Restart docker-compose
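To confirm that the chown took effect before restarting, you could list anything under .local/elasticsearch that is still owned by someone else. This is only a sketch: find_unowned is a hypothetical helper, and the 1000:1000 default simply mirrors the chown command above.

```python
import os
from typing import List


def find_unowned(root: str, uid: int = 1000, gid: int = 1000) -> List[str]:
    """List directories and files under root (root included) not owned by uid:gid."""
    if not os.path.isdir(root):
        raise FileNotFoundError(root)
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself, then every file directly inside it.
        for path in [dirpath] + [os.path.join(dirpath, name) for name in filenames]:
            st = os.lstat(path)
            if (st.st_uid, st.st_gid) != (uid, gid):
                mismatched.append(path)
    return mismatched
```

An empty list from find_unowned(".local/elasticsearch") suggests the ownership fix has been applied to every directory and file that the walk visited.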
If, when running the sample data loader, you receive a connection error related to Elasticsearch, or one like this for Neo4j:
Traceback (most recent call last):
  File "/home/ubuntu/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/neobolt/direct.py", line 831, in _connect
    s.connect(resolved_address)
ConnectionRefusedError: [Errno 111] Connection refused
If the elasticsearch container stops with the error max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535], then add the code below to the file docker-amundsen-local.yml in the elasticsearch definition.
ulimits:
  nofile:
    soft: 65535
    hard: 65535
Then check whether all 5 Amundsen-related containers are running with docker ps. Can you connect to the Neo4j UI at http://localhost:7474/browser/ and, similarly, the raw ES API at http://localhost:9200? Do the Docker logs reveal any serious issues?
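The connectivity questions above can be partly automated with a plain TCP check. The sketch below uses only the Python standard library; the port numbers come from the URLs in this guide, while the service labels and function names are illustrative assumptions. An open port shows only that something is listening, not that the service is healthy.

```python
import socket
from typing import Dict

# Ports taken from the URLs in this guide; the labels are descriptive only.
SERVICES = {
    "Amundsen UI": 5000,
    "Neo4j browser": 7474,
    "Elasticsearch API": 9200,
}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> Dict[str, bool]:
    """Map each service label to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}
```

Printing check_services() after docker-compose up gives a quick view of which services are reachable; for any that report False, docker ps and the container logs are the next place to look.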
Author
The Codecademy Team