An outlier is a data point that is far away from the rest of the dataset. Outliers do not have a formal definition, but are easy to determine by looking at histogram. The histogram below shows an example of an outlier. There is one datapoint that is much larger than the rest.


If you see an outlier in your dataset, it’s worth reporting and investigating. This data can often indicate an error in your data or an interesting insight.



Using the following lines of code, we found the values for the outlier of our dataset. It is the same value as our maximum. Replace the relevant ????????? with the maximum value in summary.txt, and the hospital it correspends to (see below).

The maximum is $81,083:

cp_data[' Average Covered Charges '].max()



The hospital that charges the most for a chest pain emergency is Bayonne Hospital Center in New Jersey.

Sign up to start coding

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?