{"id":2414,"date":"2021-11-07T16:50:51","date_gmt":"2021-11-07T16:50:51","guid":{"rendered":"https:\/\/www.psyctc.org\/psyctc\/?post_type=docs&#038;p=2414"},"modified":"2024-03-05T14:17:31","modified_gmt":"2024-03-05T13:17:31","password":"","slug":"histograms-and-barplots","status":"publish","type":"docs","link":"https:\/\/www.psyctc.org\/psyctc\/glossary2\/histograms-and-barplots\/","title":{"rendered":"Histograms and barplots"},"content":{"rendered":"\n<p>A plot of the counts of observed values for a variable against those values.  Excellent for giving a picture of the &#8220;shape&#8221; of the distribution.  <\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Details<\/h4>\n\n\n\n<p>Traditionally a distinction is made between a histogram and a barchart with the former applying to continuous and the latter to discrete variables.  Here are two plots observing that distinction.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"885\" src=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/Histogram1001-1024x885.png\" alt=\"\" class=\"wp-image-2447\" srcset=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/Histogram1001-1024x885.png 1024w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/Histogram1001-300x259.png 300w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/Histogram1001-768x664.png 768w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/Histogram1001-1536x1328.png 1536w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/Histogram1001.png 1700w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>In the left hand plot the breakdown is a &#8220;barplot&#8221; by gender (with &#8220;NA&#8221; standing for &#8220;Not Answered&#8221; and a binary gender classification only offered).  
That the categories are distinct is conventionally signalled by the gaps between the columns, with the heights of the columns showing the numbers choosing each gender category.  The right hand plot is a histogram from the same sample and shows age treated as continuous, so there are no gaps between the vertical bars.  I am not convinced that the distinction is very helpful.  <\/p>\n\n\n\n<p>For example, sometimes age is categorised with only a set of age ranges offered, perhaps:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt;20\n20 to 29\n30 to 39\n40 to 49\n&gt;= 50 <\/code><\/pre>\n\n\n\n<p>Now the data break down like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-----------------------------\nAge               n   percent\n------------- ----- ---------\n&lt;20             148     14.9%\n\n20 to 29        545     55.1%\n\n30 to 39        131     13.2%\n\n40 to 49         80      8.1%\n\n50 and over      86      8.7%\n-----------------------------<\/code><\/pre>\n\n\n\n<p>and the histogram looks like this:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"885\" src=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAge-3-1024x885.png\" alt=\"\" class=\"wp-image-2452\" srcset=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAge-3-1024x885.png 1024w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAge-3-300x259.png 300w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAge-3-768x664.png 768w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAge-3-1536x1328.png 1536w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAge-3.png 1700w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>While the barplot looks like this:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"885\" 
src=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/BarAge-1024x885.png\" alt=\"\" class=\"wp-image-2449\" srcset=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/BarAge-1024x885.png 1024w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/BarAge-300x259.png 300w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/BarAge-768x664.png 768w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/BarAge-1536x1328.png 1536w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/BarAge.png 1700w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>That&#8217;s fair enough as it represents the categories; however, the histogram clearly represents the realities of age as a continuous variable better.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Histograms by multiple categories<\/h5>\n\n\n\n<p>Histograms and barplots are wonderful for conveying visually the distributions of variables one variable at a time.  They can also be used to look at how the distribution of one variable differs (or not) by another categorical variable.  
For example, here is age but taking gender into account as well.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"885\" src=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendStack-1024x885.png\" alt=\"\" class=\"wp-image-2455\" srcset=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendStack-1024x885.png 1024w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendStack-300x259.png 300w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendStack-768x664.png 768w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendStack-1536x1328.png 1536w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendStack.png 1700w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>There the counts, actually the proportion of the total, are &#8220;stacked&#8221; so the gender category with the fewest overlies the category with the next most in that age range and the category with the most in that age range sticks out at the top.  
Alternatively, the genders can be put side by side as here (in R ggplot jargon &#8220;dodged&#8221;).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"885\" src=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendDodge-1024x885.png\" alt=\"\" class=\"wp-image-2456\" srcset=\"https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendDodge-1024x885.png 1024w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendDodge-300x259.png 300w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendDodge-768x664.png 768w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendDodge-1536x1328.png 1536w, https:\/\/www.psyctc.org\/psyctc\/wp-content\/uploads\/2021\/11\/HistAgeGendDodge.png 1700w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Though these approaches can work, comparing distributions across a continuous variable broken down by another variable is often better done with a boxplot or violin plot.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Try also<\/h4>\n\n\n\n<p><a data-type=\"docs\" data-id=\"2371\" href=\"https:\/\/www.psyctc.org\/psyctc\/glossary2\/distribution\/\">Distribution<\/a><br><a data-type=\"docs\" data-id=\"2383\" href=\"https:\/\/www.psyctc.org\/psyctc\/glossary2\/gaussian-normal-distribution\/\">Gaussian (\u201cNormal\u201d) Distribution<\/a><br><a data-type=\"docs\" data-id=\"2277\" href=\"https:\/\/www.psyctc.org\/psyctc\/glossary2\/skew-skew-distribution\/\">Skew<\/a><br><a data-type=\"docs\" data-id=\"2469\" href=\"https:\/\/www.psyctc.org\/psyctc\/glossary2\/boxplot-or-box-plot\/\">Box plot (and boxplot!)<\/a><br><a data-type=\"docs\" data-id=\"2509\" href=\"https:\/\/www.psyctc.org\/psyctc\/glossary2\/violin-plot-or-violinplot\/\">Violin plot (and violinplot!)<\/a><br><a data-type=\"docs\" data-id=\"2373\" href=\"https:\/\/www.psyctc.org\/psyctc\/glossary2\/uniform-distribution\/\">Uniform distribution<\/a><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Chapters<\/h4>\n\n\n\n<p>Chapters 5, 7 and 8.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">External resources<\/h4>\n\n\n\n<p>Good, if detailed, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Histogram\"><span style=\"text-decoration: underline;\">Wikipedia page<\/span><\/a> which itself links to a range of further resources.  It also has a link, not very well flagged up, to <a href=\"https:\/\/en.wikipedia.org\/wiki\/John_Graunt\"><span style=\"text-decoration: underline;\">John Graunt (1620 &#8211; 1674)<\/span><\/a>, a founder of demography and tabulation, and a useful reminder that while statistical methods and rich computer generated graphics are explosions of the 20th and 21st Centuries, the roots go back a long way.<\/p>\n\n\n\n<p>* My shiny <a href=\"https:\/\/shiny.psyctc.org\/apps\/Gaussian1\/\" target=\"_blank\" rel=\"noreferrer noopener\">app creating samples from Gaussian distribution<\/a> showing histogram, ecdf and qqplot<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Dates<\/h4>\n\n\n\n<p>Created 7\/11\/21, tweaks 5.iii.24.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A plot of the counts of observed values for a variable against those values. Excellent for giving a picture of the &#8220;shape&#8221; of the distribution. Details Traditionally a distinction is made between a histogram and a barchart with the former applying to continuous and the latter to discrete variables. 
Here are two plots observing &hellip; <a href=\"https:\/\/www.psyctc.org\/psyctc\/glossary2\/histograms-and-barplots\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Histograms and barplots<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","template":"","meta":{"footnotes":""},"doc_category":[18],"glossaries":[],"doc_tag":[],"knowledge_base":[],"class_list":["post-2414","docs","type-docs","status-publish","hentry","doc_category-om-book"],"year_month":"2026-04","word_count":465,"total_views":"1505","reactions":{"happy":"0","normal":"0","sad":"0"},"author_info":{"name":"chris","author_nicename":"chris","author_url":"https:\/\/www.psyctc.org\/psyctc\/author\/chris\/"},"doc_category_info":[{"term_name":"All OM book glossary entries","term_url":"https:\/\/www.psyctc.org\/psyctc\/glossary\/non-knowledgebase\/om-book\/"}],"doc_tag_info":[],"knowledge_base_info":[],"knowledge_base_slug":[],"_links":{"self":[{"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/docs\/2414","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/docs"}],"about":[{"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/types\/docs"}],"author":[{"embeddable":true,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/comments?post=2414"}],"version-history":[{"count":11,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/docs\/2414\/revisions"}],"predecessor-version":[{"id":3951,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/docs\/2414\/revisions\/3951"}],"wp:attachment":[{"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/media?parent=2414"}],"wp:term":[{"taxonomy":"doc_category","embeddable":true,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/doc_category?post=2414"},{"taxonomy":"glossaries","embedda
ble":true,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/glossaries?post=2414"},{"taxonomy":"doc_tag","embeddable":true,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/doc_tag?post=2414"},{"taxonomy":"knowledge_base","embeddable":true,"href":"https:\/\/www.psyctc.org\/psyctc\/wp-json\/wp\/v2\/knowledge_base?post=2414"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}