Get Started with Amundsen
Bootstrap a default version of Amundsen using Docker
The following instructions are for setting up a version of Amundsen using Docker.
Make sure you have at least 3GB available to docker. Install
Clone this repo and its submodules by running:
$ git clone --recursive git@github.com:amundsen-io/amundsen.git
Enter the cloned directory and run:
# For Neo4j Backend
$ docker-compose -f docker-amundsen.yml up
# For Atlas
$ docker-compose -f docker-amundsen-atlas.yml up
Ingest the provided sample data into Neo4j as follows (skip this step if you are using the Atlas backend):
- In a separate terminal window, change directory to the amundsendatabuilder submodule.
- The sample_data_loader Python script included in the example/ directory uses the Elasticsearch client, pyhocon, and other libraries. Install the dependencies in a virtual env and run the script with the commands below:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt
$ python3 setup.py install
$ python3 example/scripts/sample_data_loader.py
View the UI at http://localhost:5000 and try searching for test; it should return some results.
- You can also run an exact-match search for a table entity. For example, search for test_table1 in the table field and it returns the matching records.
Atlas Note: Atlas takes some time to boot properly, so you may not see results immediately after running the docker-compose up command. Atlas is ready once the following line appears in the Docker output:
Amundsen Entity Definitions Created...
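As a rough way to automate this wait, the sketch below scans log text for the readiness line quoted above. It assumes the logs can be captured with docker-compose -f docker-amundsen-atlas.yml logs; the helper names atlas_is_ready and fetch_compose_logs are hypothetical and not part of Amundsen.

```python
import subprocess

# Readiness line quoted in the Atlas note above.
READY_MARKER = "Amundsen Entity Definitions Created"


def atlas_is_ready(log_text: str) -> bool:
    """Return True if the readiness marker appears anywhere in the log text."""
    return READY_MARKER in log_text


def fetch_compose_logs(compose_file: str = "docker-amundsen-atlas.yml") -> str:
    """Hypothetical helper: capture the current docker-compose logs as text.

    Assumes docker-compose is on PATH and is run from the cloned directory.
    """
    result = subprocess.run(
        ["docker-compose", "-f", compose_file, "logs", "--no-color"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling atlas_is_ready(fetch_compose_logs()) in a loop with a short sleep between attempts is one way to block until Atlas has finished booting.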
You can verify that dummy data has been ingested into Neo4j by visiting the Neo4j UI and running MATCH (n:Table) RETURN n LIMIT 25 in the query box. You should see two tables:
You can verify the data has been loaded into the metadataservice by visiting:
If the Docker container doesn’t have enough heap memory for Elasticsearch,
es_amundsen will fail during
- docker-compose error:
es_amundsen | : max virtual memory areas vm.max_map_count  is too low, increase to at least 
- Increase the heap memory (detailed instructions here)
- Make entry vm.max_map_count=262144. Save and exit.
- Reload settings
$ sysctl -p
- docker-compose error:
If docker-amundsen-local.yml stops because of org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Failed to create node environment, then es_amundsen cannot write to its data directory. Change the ownership of that directory:
$ chown -R 1000:1000 .local/elasticsearch
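To confirm whether the vm.max_map_count setting from the first error above is already high enough, a minimal sketch like the following can help. It assumes a Linux host where the current value is exposed at /proc/sys/vm/max_map_count; the function names are hypothetical, and the threshold is the 262144 value given in the steps above.

```python
from pathlib import Path

# Value given in the troubleshooting step above.
REQUIRED_MAX_MAP_COUNT = 262144

# Standard Linux procfs location for this kernel setting (assumes a Linux host).
PROC_PATH = Path("/proc/sys/vm/max_map_count")


def map_count_is_sufficient(raw_value: str, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True if the textual sysctl value meets the required minimum."""
    return int(raw_value.strip()) >= required


def check_host() -> bool:
    """Read the live value from procfs and compare it with the requirement."""
    return map_count_is_sufficient(PROC_PATH.read_text())
```

If check_host() returns False, the sysctl change described above has not taken effect yet.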
If, when running the sample data loader, you receive a connection error related to Elasticsearch, or one like this for Neo4j:
Traceback (most recent call last):
  File "/home/ubuntu/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/neobolt/direct.py", line 831, in _connect
    s.connect(resolved_address)
ConnectionRefusedError: [Errno 111] Connection refused
If the Elasticsearch container stops with an error
max file descriptors  for elasticsearch process is too low, increase to at least , then add the code below to the file
ulimits:
  nofile:
    soft: 65535
    hard: 65535
Then check whether all 5 Amundsen-related containers are running with docker ps. Can you connect to the Neo4j UI at http://localhost:7474/browser/ and, similarly, to the raw ES API at http://localhost:9200? Do the Docker logs reveal any serious issues?
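The connectivity questions above can be checked in one pass with a small script. This is only a sketch using the Python standard library: the URLs are the ones mentioned in this guide, while the names ENDPOINTS, is_reachable, and report are hypothetical. It treats any HTTP response, including an error status, as proof that the service is listening.

```python
import urllib.error
import urllib.request

# URLs mentioned in this guide: Amundsen UI, Neo4j browser, raw Elasticsearch API.
ENDPOINTS = {
    "Amundsen UI": "http://localhost:5000",
    "Neo4j UI": "http://localhost:7474/browser/",
    "Elasticsearch API": "http://localhost:9200",
}


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status (e.g. 401 or 404).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def report(endpoints: dict = ENDPOINTS) -> dict:
    """Map each endpoint name to whether it is currently reachable."""
    return {name: is_reachable(url) for name, url in endpoints.items()}
```

Printing report() after docker-compose has been up for a while gives a quick view of which service, if any, is refusing connections; a False entry for Neo4j would match the ConnectionRefusedError shown earlier.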