Youth Crime, An Epidemic or Sniffle: Analyzing 2019-2022 Crime in QLD

I’ve attached a primer on basic statistics as Appendix 1. If the reader does not come from a data background I strongly suggest reading it.

Introduction

Youth crime in Queensland has been called a crisis[1], a beat-up, something to be cracked down on[2], and everything in between. Part of the problem with analysing youth crime is the way the data is presented: it’s not all tracked the same way, and it’s not always easy to ‘intersect’ different charts and tables.

The two main sources of data on youth crime in Queensland are the annual crime reports of the Australian Bureau of Statistics (ABS) and those of the Queensland Government Statistician’s Office (QGSO)[3]. The data these agencies present, and the way they present it, informs the public of broad youth justice trends and sets the stage for the arguments around them.

I have chosen to rely on the QLD Annual Crime Reports. As these have come out giving data from 2019-2022, this is the period I will focus on.

As always, the Excel files are attached (as the data is public). See Appendix 3 for copies of all the relevant files.

The way data is presented

The reason for this article is that some of the existing data on youth crime is difficult to read or understand. As a result, parties on both sides of the argument have been able to seize on it. Take, for example, this plot [QUEENSLAND TREASURY Crime report, Queensland, 2019–22, Queensland Government Statisticians Office, https://www.qgso.qld.gov.au/statistics/theme/crime-justice/crime-justice-statistics/reported-crime]

I would argue this plot is hard to use. It doesn’t show the change over the years, and it’s hard to read and understand. What it does show is that crime is much more prevalent in the outback than in the city. Beyond that, it’s difficult to interpret. You could say there’s an outback crisis because the numbers are so high, or that the dataset is skewed because there are so few people in the outback.

As such, I found it preferable to take apart the data and rebuild some graphs. I think this allows for a simpler deconstruction of the dataset. Unfortunately, this could not be easily automated, so there was a manual data input aspect to this task.

Breakdown by Region

We can start by taking all the QLD crime reports and putting the youth offences into a single table for easier analysis, then plotting that table. This reveals that the categories recorded are roughly the same, except that the 2019-2020 report appears to aggregate FNQ and the North Coast into the Northern District. As we want our data to be consistent, we’ll make that same aggregation across 2021 and 2022[i]. This covers youth offenders aged 10-17[4], as per the recorded and publicly available data.
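As a rough sketch of that aggregation step, the Python below relabels regions so every year uses the same coarse categories, then sums the counts. The region label spellings, the function name and the counts are all assumptions made for illustration; the counts in particular are placeholders and are not figures from the reports.

```python
from collections import defaultdict

# Hypothetical mapping from the finer region labels in the later reports
# to the coarser label used in the 2019-2020 report. The exact spellings
# are assumptions and should be checked against the spreadsheets.
REGION_MAP = {
    "Far North Queensland": "Northern",
    "North Coast": "Northern",
}

def harmonise(rows):
    """Sum offence counts per (year, region) after relabelling regions."""
    totals = defaultdict(int)
    for year, region, offences in rows:
        # Regions not in the mapping (e.g. Central) keep their own label.
        totals[(year, REGION_MAP.get(region, region))] += offences
    return dict(totals)

# Placeholder counts for illustration only; NOT taken from the reports.
example_rows = [
    ("2021-22", "Far North Queensland", 100),
    ("2021-22", "North Coast", 50),
    ("2021-22", "Central", 80),
]

combined = harmonise(example_rows)
print(combined)
```

Keeping the mapping in one place makes the grouping decision explicit, so it is easy to rerun the totals with North Coast assigned elsewhere.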

This is fairly obvious. There is dramatically more crime in North Queensland, and it’s increasing whilst elsewhere the total offence counts are roughly static. I’ve put the subgraphs in Appendix 2 for a more detailed breakdown. For now, let’s look at the breakdown of the Northern Queensland category, which appears to be an outlier.

This is a little subtler. The trick to this data is to look at the Y axis and the totals (bottom row), as some offences are much more frequent than others. The majority of the growth in crime in North Queensland is property crime. In particular, a large portion of the growth comes from two fairly serious offences – Unlawful Use of Motor Vehicle (UUMV, from 1000 to over 2500) and Unlawful Entry (approximately 3000 to 4500) – with another 1000 or so coming from theft. The highlighted sections in the plots above are the total offences in the region and the total property offences (which constitute the majority of the growth in offences).
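To put those increases on a common scale, here is a small Python sketch that converts the start and end counts quoted above into percentage growth. The inputs are the approximate values read off the plots, so the outputs are only as precise as those readings; the function name is just for illustration.

```python
def pct_change(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100

# Approximate values read off the plots, as quoted in the text.
uumv_growth = pct_change(1000, 2500)   # a lower bound, since the text says "over 2500"
entry_growth = pct_change(3000, 4500)

print(f"UUMV: at least {uumv_growth:.0f}% growth")
print(f"Unlawful Entry: about {entry_growth:.0f}% growth")
```

On these rough figures, both offences add a similar number of extra incidents (around 1500 each), but UUMV grows far faster relative to where it started.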

With one more statistic, we can tie this together – the number of unique offenders. This will suggest whether there’s a wave of new criminals in this sector, or a worsening of existing offenders.

Once more there’s a problem, this time in the grouping of offenders. In the QLD reports they are counted in five-year brackets: 10-14 and 15-19. This is obviously less than ideal, as 18 and 19 year olds are not juveniles.

A complicating factor is that not all ABS reports present the same data. If we skip forward a year (the 2022-23 ABS release, a year ahead of the QLD crime reports), we can see in the table below that 14-17 year olds constitute the overwhelming majority of youth offenders. Unfortunately, the data made public is a little different each year, and we can’t take this back further.

Male + Female Unique Offenders (taken from Table 24, 2022-2023 Youth Offenders Statistics, ABS)
Age      2021-2022   2022-2023
10-11          316         324
12-13         2063        1850
14-17         8496        8134
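To make the "overwhelming majority" claim concrete, the following Python sketch computes each age bracket's share of unique offenders. It uses the counts from the table above (read as 316/324, 2063/1850 and 8496/8134); the variable and function names are mine, chosen for illustration.

```python
# Unique offender counts by age bracket, as read from the table above.
offenders = {
    "10-11": {"2021-22": 316, "2022-23": 324},
    "12-13": {"2021-22": 2063, "2022-23": 1850},
    "14-17": {"2021-22": 8496, "2022-23": 8134},
}

def bracket_shares(year):
    """Return each bracket's fraction of all 10-17 unique offenders for a year."""
    total = sum(counts[year] for counts in offenders.values())
    return {age: counts[year] / total for age, counts in offenders.items()}

for year in ("2021-22", "2022-23"):
    shares = bracket_shares(year)
    summary = ", ".join(f"{age}: {share:.1%}" for age, share in shares.items())
    print(year, summary)
```

In both years the 14-17 bracket comes out at a little under four in five unique offenders.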

So how do we deal with this issue, given that we have to use either a less than ideal bracket (15-19) or better brackets that only cover a couple of years? We can start by plotting the 10-14 and 15-19 brackets. If the result is really obvious, it makes our job easy. First we’re going to plot both brackets together (from the Queensland Government’s statistics), then the brackets individually.

Once more, it’s incredibly obvious the problem isn’t an increase in offenders, but an increase in offences. Furthermore, with our data from earlier, we can localize this predominantly to North Queensland.
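The reasoning here boils down to a ratio: if the number of unique offenders is flat while the number of offences rises, then offences per offender must be rising. A minimal Python sketch of that comparison follows. The numbers are placeholders chosen only to show the calculation and are not taken from any report. The ratio is also only indicative with the public data, since the offence counts cover ages 10-17 while the offender brackets run to 19.

```python
def offences_per_offender(offences, unique_offenders):
    """Average number of recorded offences per unique offender."""
    return offences / unique_offenders

# Placeholder values for illustration only; NOT figures from the reports.
placeholder = {
    "year A": {"offences": 1000, "unique_offenders": 500},
    "year B": {"offences": 1500, "unique_offenders": 500},
}

rates = {
    year: offences_per_offender(v["offences"], v["unique_offenders"])
    for year, v in placeholder.items()
}

for year, rate in rates.items():
    print(f"{year}: {rate:.1f} offences per offender")
```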

Now we can pair these up with some of the findings in the report that can’t necessarily be replicated from the publicly available data. We will take the following two graphs from the 2021-2022 Crime Report[5].

The first plot shows that there has been a sharp increase in Unlawful Entry, Theft, and UUMV offences in the last decade, which was also reflected over the 2019-2022 window. Personally, I find this graph totally unintuitive, and cannot use it to make any further findings beyond a general increase.

The second plot reflects that the same crimes are growing as in our short-term statistics, and that the offender profile is becoming increasingly Indigenous, with the number of Indigenous offenders increasing by about 50% over 10 years (from about 13500 to 20000).

Conclusion

For the years 2019-2022 it appears that the youth justice issues are localized in North Queensland, but are so severe that they arguably taint the statistics across the whole state. The increase also appears to come from a small subset of repeat offenders, as opposed to a general increase in the offender population.

As such, there is merit to the position ‘it’s a beat up’ outside of North Queensland, and inside of it, it’s reasonable to say there is a crisis.

Appendix 1 – A 2 minute primer on statistics and data analysis

The presentation of data changes its interpretation: focus can be drawn to and away from points, and graphs can be made intuitive or used to obfuscate. There’s no such thing as ‘impartial’ data. One can collect statistics, but there’s partiality in what is collected, how much is collected, who collects it and the format it’s presented in[ii], be it a table, plot, statistic or otherwise.

The job of the statistician is not only to provide information, but to draw the reader’s eye to important points without presenting them unfairly. For example, if you had a simple line graph of annual crime numbers, it may be prudent to label when COVID happened on that line, as the data will otherwise appear to change abruptly without good reason. If, however, you were to label “Jane won a spelling bee”, that would be misleading, and would imply a connection between winning spelling bees and crime.

Data often gets murky when privacy is involved, as aggregating data by definition destroys some data points. There’s always a tradeoff when reducing data down: less data means weaker analysis, but the reduction can also serve important purposes. For example, offenders aren’t ‘named and shamed’ in public statistical records, which means I can’t go looking up if Bill is an offender or, worse, if there’s a generational trend of offending in Bill’s family. That information is removed from what the public can access. What is kept, aggregated and removed is a matter of opinion and degree, but loosely, the more that is aggregated and removed, the less useful the dataset becomes as a whole.

Finally, the statistician has to determine which events are related and which are not, as in the spelling bee example above. This is not an exact science: there are measures and tools, but they can be fooled. For example, if someone flips a coin 1024 times, they will (on average) flip heads 10 times in a row about once, which they can then report as an ‘anomaly’ if they don’t report the rest of the tosses (selective reporting of this kind is a form of P-hacking). Things will happen due to random chance, and it’s the statistician’s job to sort events into those that are definitely chance, maybe chance, and probably not chance, and present the useful ones to the reader. There is some ‘art’ to the science, so to speak.
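As a rough check on the coin example, the Python sketch below assumes a fair coin and independent flips. It computes two things: the expected number of 10-flip windows that are all heads within 1024 flips, and the probability that at least one such run appears. The function names are chosen purely for illustration.

```python
def expected_windows(n_flips, run_length):
    """Expected number of length-`run_length` windows that are all heads.

    There are (n_flips - run_length + 1) windows, each all heads with
    probability 0.5 ** run_length. Overlapping windows are counted separately.
    """
    return (n_flips - run_length + 1) * 0.5 ** run_length

def prob_at_least_one_run(n_flips, run_length):
    """Probability of at least one run of `run_length` heads in `n_flips` flips."""
    # state[j] = probability that no full run has occurred yet and the
    # current streak of heads has length j.
    state = [0.0] * run_length
    state[0] = 1.0
    for _ in range(n_flips):
        new_state = [0.0] * run_length
        for streak, p in enumerate(state):
            new_state[0] += p * 0.5                 # tails resets the streak
            if streak + 1 < run_length:
                new_state[streak + 1] += p * 0.5    # heads extends the streak
            # otherwise heads completes a run and that probability leaves `state`
        state = new_state
    return 1.0 - sum(state)

print(expected_windows(1024, 10))
print(prob_at_least_one_run(1024, 10))
```

The expected count comes out just under one (1015/1024), in line with the "about once" figure. The probability of seeing at least one run is lower than that expected count, because all-heads windows overlap and so tend to arrive in clusters.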

Appendix 2


Appendix 3

[1] Australian Financial Review, Queensland’s youth crime and no punishment crisis is no beat-up, Amanda Stoker, 8/2/2024, https://www.afr.com/politics/queensland-s-youth-crime-and-no-punishment-crisis-is-no-beat-up-20240205-p5f2cz#:~:text=Break%2Dins%20and%20thefts%20are,ins%20that%20occurred%20in%20Queensland.

[2] Brisbane Times, Arrests won’t stop them: How police plan to crack down on youth crime in 2024, Cloe Read, 8/1/2024, https://www.brisbanetimes.com.au/national/queensland/arrests-won-t-stop-them-how-police-plan-to-crack-down-on-youth-crime-in-2024-20240108-p5evtw.html

[3] QUEENSLAND TREASURY Crime report, Queensland, 2019–22, Queensland Government Statisticians Office, https://www.qgso.qld.gov.au/statistics/theme/crime-justice/crime-justice-statistics/reported-crime

[4] The reports appear to count offenders up to, but not including, age 18, so 17 year olds are included.

[5] QUEENSLAND TREASURY Crime report, Queensland, 2021–22, Queensland Government Statisticians Office,


[i] This is a little misleading, as arguably North Coast should be kept separate from Northern. However, it had to be grouped somewhere, and the only options were Central or Northern_Updated. Given that Central has remained a static category through all the datasets, North Coast was fused into the combined Northern category.

[ii] This is a very limited subset. Data can be biased at essentially every step of the process, both deliberately and by accident.
