Investigators' Blog
Students thrilled with summer internship experiences
Students who’ve taken part in previous summer internship programmes run by Te Pūnaha Matatini have expressed a high level of satisfaction with their experiences. Indeed, the 10-week paid internship programme provides an excellent opportunity for students to hone their data analytics skills while working for organisations in a real-world setting.
A total of 21 undergraduate and postgraduate university students from around New Zealand were selected for our 2017-18 programme. Divided into teams, the interns were placed on a wide range of projects working for various organisations, including Iwi, government and private firms based in either Auckland or Wellington.
There were some exciting new opportunities. One team, for instance, were placed on a project with Dragonfly Data Science and Te Hiku Media based in Wellington. Their internship involved work related to Te Hiku’s Kōrero Māori project, developing language tools that will enable speech recognition and natural language processing of te reo Māori. This requires the collection of more than 100,000 sentences and 250 hours of Māori language corpus. Once complete, it aims to provide these language tools to the Māori ICT industry.
Interns share their thoughts and details of their work
One of the student interns on this project was William Asiata, a BSc Mathematics graduate from the University of Canterbury and a current Master of Information Technology student at the University of Auckland.
“As a result of the internship I was able to generate a corpus of all te reo Māori spoken in Parliament which will be included in the greater corpus used to train the digital natural language processor language model,” said William. “As an interesting by-product we also produced some statistics about the historical usage of te reo in Parliament. I had the opportunity to learn and practice the Python and R programming languages and exercise data processing skills.
“I believe that it was a great opportunity for an inexperienced student to sharpen one’s skill set, to clarify future career goals, and to gain direct insight into the ICT and data science industries through practical work experience on meaningful, high-impact projects and the chance to learn directly from working professionals,” he added.
Check out our latest guest blog! This one is by maths grad William Asiata about his internship with @dflydsci, @TeHiku and @PunahaMatatini, working on the #KōreroMāori project & analyzing the rapid recent growth in the use of te reo by NZ parliamentarians. https://t.co/oxI0kuoPMp pic.twitter.com/POhuQ6Q9fQ
— Te Pūnaha Matatini (@PunahaMatatini) March 1, 2018
Another team worked on a project supporting research by Kate Hannah, Te Pūnaha Matatini’s Executive Manager, into the historical representation of women in science.
Emma Vitz, a statistics and psychology graduate from Victoria University of Wellington assigned to this project, researched an algorithm that classifies people by gender according to their first name, and blogged about the ethical pitfalls of such an approach. Emma also began research into networks underlying science collaboration in New Zealand. “I particularly enjoyed using both R and Python in the internship, and collaborating with researchers and other interns from Te Pūnaha Matatini,” said Emma.
Comp programs that learn from past experience/data are all in vogue. However, as data science consultant @EmmaVitz explains, such machine learning can reinforce racism when crunching population-based numbers. Check out her timely & thought-provoking piece: https://t.co/IKYI6juu7X pic.twitter.com/c6kEe8wVwI
— Te Pūnaha Matatini (@PunahaMatatini) February 1, 2018
Also on the team was Beth Rust, a BA (Hons) history graduate from Victoria University of Wellington, who conducted a literature review of the background and achievements of women in science.
“Women are everywhere in science,” said Beth. “I noticed a few trends: a lot of early women scientists tended to be in botany – then later women dominated home science – now they are everywhere. I’ve also learnt a lot these past ten weeks, not just in terms of the history of science but also in a more general sense,” she added. “I’m very grateful for the experience and everything it’s taught me.”
It was a privilege hosting history grad Beth Rust over the summer. Check out her blog – an in-depth investigation into the extensive contributions of Māori & Pacific women in science over the years & the importance of their representation within the field. https://t.co/op9fZSXnSh pic.twitter.com/9rv31a86iz
— Te Pūnaha Matatini (@PunahaMatatini) February 11, 2018
Te Pūnaha Matatini Whānau member Stephen Merry, who is pursuing a PhD in mathematics at the University of Canterbury, also took part in the internship programme, working with the Social Investment Agency in Wellington.
“I worked on two projects,” said Stephen. “The first investigated the scope of data held inside and outside of the Integrated Data Infrastructure, and the second examined how people’s use of health services is affected by the services’ accessibility. This internship gave me the opportunity to work in a different environment, and I felt a genuine sense of purpose completing the projects,” he added. “My colleagues in the Social Investment Agency were enormously helpful and understanding throughout, and the experience overall is something I would recommend to anyone interested.”
Following the programme, interns were invited to blog about their work for the Te Pūnaha Matatini website and these articles resulted in very positive feedback on Twitter – with even some New Zealand parliamentarians chiming in!
We love hearing about this stuff! 🤓One of the lesser-known perks of hosting a large publicly-available text corpus is seeing the cool projects that come out of it. (For those playing along at home, you can find #Hansard reports on our website: https://t.co/h8GdImYvud) #nzpol https://t.co/s2RaMq0hrN
— NZ Parliament (@NZParliament) March 8, 2018
Project to boost scientist mātauranga capability
A Te Pūnaha Matatini research project that aims to improve the way in which scientists connect and work with Māori has been awarded $100,000 in funding by New Zealand’s Ministry of Business, Innovation & Employment (MBIE).
The project, part of MBIE’s Te Pūnaha Hihiko: Vision Mātauranga Capability Fund, will be led by Dr Tara McAllister (pictured above), an environmental scientist with the University of Auckland, in collaboration with ecologist Dr Cate Macinnis-Ng and earth systems scientist Dr Daniel Hikuroa, Principal Investigators with Te Pūnaha Matatini at the University of Auckland. Importantly, the project team will partner with Mahaanui Kurataiao Limited, an environmental and resource management advisory firm based in Canterbury.
While there are some excellent examples of scientists engaging well with Māori communities, there are also instances when connecting has been a struggle.
“We want to look at how we make those interactions more successful, more productive, and more workable for everybody involved,” Dr Macinnis-Ng says.
“So we are going to co-develop a project with an Iwi group, where we’ll look at what their science needs are, and work out who in our field can deliver those things. By co-developing the project, it’s all about what the needs are of that group, rather than imposing what scientists want to do.”
The project will be conducted in a reflective way so the project team can understand what works best for the different groups involved. It will also develop te reo science materials appropriate for school curricula.
“We’ll be developing some teaching materials for kura kaupapa to make science more accessible to everyone,” says Dr Macinnis-Ng.
The project will be very important to Te Pūnaha Matatini’s wider research programme, says Shaun Hendy, the Centre’s Director and Professor of Physics at the University of Auckland.
“Building close engagement with Māori communities and learning about the mātauranga of complex systems is a wonderful opportunity for us,” he says.
“Not only will this project be essential to us in meeting our research goals, it will also provide social, economic, and environmental benefits to Aotearoa New Zealand.”
Testing large-scale predator control in Hawke’s Bay
New Zealand has an excellent record of conserving its native flora and fauna through pest control measures, especially in large uninhabited areas. Predator Free 2050 is a bold initiative that aims to rid the country of its most damaging invasive predators. However, to completely eliminate such predators from our shores, new and ambitious approaches are needed.
Implementing effective predator control over large areas
New Zealand’s unique and diverse native species of flora and fauna are extremely vulnerable to invasive mammals. Our often-publicised successes in conserving the country’s biodiversity by managing pests have mainly been restricted to large uninhabited areas. Meanwhile, large tracts of land owned by private individuals remain relatively unprotected.
When it comes to land management decisions such as pest control actions, careful negotiations are required with a wide range of stakeholders with differing views – from cat-lovers to rabbit-haters – so that agreements can be reached.
Experience has shown there are minimum thresholds for landholder participation in predator control measures for them to be successful. In practice, coordinated community efforts are required so that pest reinvasion from a few untreated properties does not compromise pest control achieved by others.
Another crucial element is biological connectivity between properties – the establishment of ‘safe passage’ corridors crossing landowner boundaries greatly assists in the dispersal of native species between fragments of suitable habitat. Large-scale pest control is therefore a spatial issue with social, environmental, and economic components.
The spread model is still being developed to provide more functionality for managers. In particular, we are investigating the ways in which landholders influence one another, how agencies influence landholders, and the presence of key influential landholders who might help catalyse action. Ultimately, the aim of the model is to improve strategic planning for mammal control at regional scales. The model also serves as a template for future dynamic maps of other mammal species.
Large-scale Cape to City research project in Hawke’s Bay
Te Pūnaha Matatini investigators Audrey Lustig, Mike Plank and Alex James, from the University of Canterbury, are involved in a large-scale predator control initiative covering 26,000 hectares of agricultural land in Hawke’s Bay, part of a wide range of research activities referred to as the Cape to City research project by the Hawke’s Bay Regional Council.
“This is just a start for a much more ambitious project that proposes a vision to eliminate invasive predators from the entire country,” says Audrey. “In this work, we develop a generic modeling approach as a planning tool for predicting the abundance and the likely persistence of four New Zealand top mammalian predators in the light of potential changes in management effort across human-dominated landscape.”
The first part of the project aims to generate a computer model for predicting the distribution and abundance of mammalian species across the landscape, the ways in which animals move from their natal sites, and how their distributions and abundance are affected by control interventions.
Such modelling can help inform managers on the likelihood of success of a specific pest control action (assuming every landholder participates in the control action). It also allows exploration of some of the mechanisms by which mammal populations might recover after control operations.
Importance of multi-stakeholder engagement
The work builds on a pre-existing knowledge base and data acquired by the Hawke’s Bay Regional Council, Department of Conservation, Manaaki Whenua and the Biological Heritage Challenge to bring about practical improvements in mammalian pest management in New Zealand.
“Such inter-organisational joint effort is common in New Zealand, but to me, what was critical was to bring a more practical insight into my research,” says Audrey. “In particular, the provision of direct feedback from decision-makers forms an integral part of the learning process and enriches my research experiences and outcomes, while providing useful information to the Hawke’s Bay Regional Council.”
For further details about this project, please contact us today.
New report shows mothers take pay cut to have a baby
A new report co-authored by Dr Isabelle Sin, Te Pūnaha Matatini Principal Investigator from Motu Economic and Public Policy Research (pictured), has revealed that mothers experience an average 4.4% wage decrease after having a baby.
The report’s findings made the front page of the New Zealand Herald print edition, with commentary from Associate Professor Siouxsie Wiles – also a Te Pūnaha Matatini Principal Investigator – and her husband, mathematician Professor Steven Galbraith, both from the University of Auckland. Check out the article here. Isabelle Sin was also interviewed on RNZ’s Nine To Noon – listen in here.
How do scientific articles and patents gather in importance?
Te Pūnaha Matatini researchers are collaborating across disciplines to develop novel tools that allow us to better understand trends underlying the citation of scientific papers and patents, a key indicator of their subsequent impact or importance.
PhD student Kyle Higham and his supervisors Ulrich Zuelicke (Uli) and Michele Governale from Victoria University of Wellington, and innovation economist Adam Jaffe from Motu Economic and Public Policy Research, have been researching how patents and scientific articles accumulate citations. Mapping the observed dynamics to a well-known network model, they were able to improve on previous studies by controlling for ‘citation inflation’ – an effect caused by the ever-increasing rate at which patents or articles are produced by inventors and researchers.
“As a result, we were able to reliably extract crucial network-model parameters and obtained extremely good agreement between data and model predictions for citation distributions,” says Uli. “Our work has proved to be a useful basis for gaining a deeper understanding of citation dynamics and is being utilised by us and others in the field to design improved network-model descriptions.”
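To make the idea of ‘citation inflation’ concrete, here is a minimal Python sketch of one simple way to correct for it. This is not the method used in the team’s study, which is not described here. It only assumes that citations received in a given year can be rescaled by how many citing documents were published that year, relative to a chosen base year. The function name `deflate_citations` and the example figures are hypothetical.

```python
def deflate_citations(citations_by_year, documents_by_year, base_year):
    """Rescale yearly citation counts to the publication volume of base_year.

    Assumption (not from the study): a citation received in a year when twice
    as many citing documents were published counts for half as much.
    """
    base_volume = documents_by_year[base_year]
    return {
        year: count * base_volume / documents_by_year[year]
        for year, count in citations_by_year.items()
    }


# Hypothetical example: publication volume doubles between 1990 and 2000.
documents_by_year = {1990: 1000, 2000: 2000}
citations_by_year = {1990: 10, 2000: 10}

deflated = deflate_citations(citations_by_year, documents_by_year, base_year=1990)
print(deflated)
```

In this toy example the ten citations received in 2000 are worth half as much as the ten received in 1990, because twice as many documents were available to cite from.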
Study suggests current rate of innovation faster than ever
The “icing on the cake”, says Uli, is that their study considered citation dynamics within specialised technology sectors for patents and individual physics research fields for articles.
“We were able to identify faster-moving technologies and research fields based on their faster rate of obsolescence exhibited in the citation dynamics.”
“Interestingly, we also found evidence for obsolescence times to have become shorter for physics articles published in 2000 compared with older ones from 1990. This indicates a general trend for the research frontier to move faster now than in the past, which is an interesting finding whose social origin deserves further exploration.”
Research helps to inform science and innovation policies
Uli explains there are good reasons to study citation dynamics.
“Research on citation dynamics can provide tools with which to inform rational science and innovation policies. Such research also underpins the design of meaningful and robust informetric impact measures.”
“To us, citation data provide a fingerprint or reflection of knowledge generation as a social endeavour. Citations could be, or are being, mined to understand [for example] geographical and social patterns of knowledge diffusion through communities of inventors and academics, as well as historical trends and drivers for knowledge generation and consumption.”
Keen to learn more about this project?
If you’re interested in finding out more about this project, please refer to the team’s most recent study findings reported in Physical Review E and Journal of Informetrics.
Social network analytics to aid vulnerable kids
Te Pūnaha Matatini investigators Mike Plank, Alex James and Jeanette McLeod, together with postdoctoral research fellow Daniel Lond, are using social network analysis to assess risk in vulnerable children in New Zealand.
Collaborating with our stakeholders in the government sector
Working with an extensive data set, the team is exploring how the Ministry of Social Development (MSD) can improve its measures of the risk of harm to vulnerable children, for use by front-line practitioners. Directly funded by MSD, the researchers aim to develop tools that can be used to protect at-risk children and improve their lives.
The project uses relationship data pertaining to children who have had contact with Child, Youth and Family (CYF) from 2005 to 2016, and includes all relationships observed by CYF staff in their work with that child and their family. CYF has since been succeeded by the Ministry for Children, Oranga Tamariki (MCOT).
Using network science to develop tools that can improve outcomes
Networks are constructed to map the relationships between different individuals within the database. By examining these networks we are identifying key relationship risk factors that lead to children being of high estimated concern.
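As a rough illustration of what constructing such a network involves, the Python sketch below builds an undirected network from pairs of related individuals and finds everyone connected to a given person. It is a generic example only: the record format, the identifiers and the function names `build_network` and `connected_group` are hypothetical, and it says nothing about the risk factors the team actually examines.

```python
from collections import defaultdict, deque


def build_network(relationships):
    """Build an undirected network from (person_a, person_b) pairs."""
    network = defaultdict(set)
    for a, b in relationships:
        network[a].add(b)
        network[b].add(a)
    return network


def connected_group(network, start):
    """Return everyone reachable from start, using breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbour in network.get(person, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical, anonymised relationship records.
records = [
    ("child_1", "adult_1"),
    ("adult_1", "child_2"),
    ("child_3", "adult_2"),
]
network = build_network(records)
print(sorted(connected_group(network, "child_1")))
```

Simple structural features like the size of a connected group or the number of direct relationships are the sort of quantities that network analysis makes easy to compute. Which features matter in practice is a research question for the team.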
Preliminary results suggest that this approach can provide insight to help social worker decision making. The tool can be used by CYF staff, in addition to their existing experience and protocols, to assist in making real-time assessments regarding in-depth investigation or intervention.
Please contact us today if you would like to find out more about this project.
Te Reo Māori in New Zealand Parliament
I was one of two summer 2017-18 student interns for the Kōrero Māori project with Dragonfly Data Science, Te Hiku Media and Te Pūnaha Matatini. We were assigned to help collect a corpus of te reo Māori text that would be used to train the written language model component of a te reo Māori natural language processing engine. When ready, the natural language processor will be used as the base for software, like Apple’s artificially intelligent ‘Siri’, that will be capable of understanding te reo Māori.
One text source in particular was identified that is publicly available online and known to contain te reo Māori – that is the New Zealand Parliamentary Debates as recorded in the Hansard reports.
The written record of Parliamentary Debates (Hansard) makes up over 700 volumes of text spanning from 1854 to the present day, and daily reports continue to be published online within a few hours of the words being spoken in Parliament.
Working through the Hansard
A variety of challenges were encountered while programming an algorithm that could successfully sort through the text in all the volumes, accounting for a variety of text structures, and detecting and extracting te reo Māori.
Hansard characteristics:
- Hansard volumes prior to 1867 are assembled from newspaper publications and the like – the Hansard reporters first began their work in 1867.
- Prior to volume 410 (1977), speeches were not always directly quoted and were often written in a narrative style. It is possible that at times te reo was spoken but only recorded as a narrative in English. From volume 410 onwards, all speeches are directly quoted.
- Prior to volume 483 (1987), the volumes are published using non-digital means. Digital text has been generated from optical character recognition of scans – OCR from the earlier volumes is not the best. From volume 483 (1987) onwards the debates are published using computer word processing software.
- In 1994 the Hansard reports begin to use macronised vowels for te reo Māori words.
- From volume 606 (2003) onwards, the daily Hansard reports are available online as HTML formatted web pages.
In the end, the programme extracts segments of speech that have a high percentage of Māori words. It also counts all the Māori, non-Māori and ambiguous (e.g. ‘he’, ‘to’, ‘a’) words that are spoken within each day of debates.
Across the 700+ volumes, the programme has sorted through over 420 million words to detect about 7400 speech segments that are at least 50% te reo and have a combined total of about 390,000 Māori words.
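The counting and extraction steps described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the programme actually written for the project. It assumes a word looks like te reo Māori if it uses only the Māori alphabet and every consonant (including the digraphs ‘ng’ and ‘wh’) is followed by a vowel. The short list of ambiguous words and the choice to leave ambiguous words out of the percentage are also assumptions. All names in the code are hypothetical.

```python
import re
import unicodedata

# Assumed orthographic rule: an optional consonant (h k m n p r t w, ng, wh)
# followed by one or more vowels, repeated for the whole word.
MAORI_WORD = re.compile(r"(?:(?:ng|wh|[hkmnprtw])?[aeiouāēīōū]+)+")

# Hypothetical sample of words that fit the rule but are also English words.
AMBIGUOUS = {"a", "he", "to", "me", "i", "no", "one", "more", "we"}


def classify(word):
    """Label a single word as 'maori', 'ambiguous' or 'other'."""
    w = unicodedata.normalize("NFC", word).lower()
    if w in AMBIGUOUS:
        return "ambiguous"
    if MAORI_WORD.fullmatch(w):
        return "maori"
    return "other"


def count_words(text):
    """Count Māori, non-Māori and ambiguous words in a piece of text."""
    counts = {"maori": 0, "other": 0, "ambiguous": 0}
    text = unicodedata.normalize("NFC", text)
    for token in re.findall(r"[^\W\d_]+", text):  # runs of letters only
        counts[classify(token)] += 1
    return counts


def maori_fraction(text):
    """Fraction of Māori words, ignoring ambiguous words (an assumption)."""
    counts = count_words(text)
    decided = counts["maori"] + counts["other"]
    return counts["maori"] / decided if decided else 0.0


def is_te_reo_segment(text, threshold=0.5):
    """True if at least `threshold` of the decided words look like te reo."""
    return maori_fraction(text) >= threshold


print(count_words("Tēnā koe e te Mana Whakawā"))
print(is_te_reo_segment("I move that the bill be read"))
```

A rule based only on spelling cannot tell te reo Māori apart from languages with a similar alphabet and syllable structure, so a real pipeline would likely need word lists or other checks on top of it.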
History of te reo in Parliament
Several interesting discoveries were made after examining the result and making a graph (see figure below):
- Up until the 1980s the proportion of te reo Māori speech in Parliament was negligible – less than 0.1% for more than 130 years. However, over the last two to three decades the percentage of te reo spoken in Parliament has grown remarkably, reaching as high as 2% in a year.
- We found that Māori words make up about 0.2-0.4% of what people say in Parliament on average if they aren’t speaking in te reo Māori – most probably common words like names.
- There is a cluster of te reo speeches around the 1940s.
- Several MP speeches that include other Polynesian languages are counted to contain about 50% – 70% “Māori” words – this is due to similarity between languages and alphabets.
Interpretation of the growth trend
Parliament, and the New Zealand House of Representatives within it, is an institution that endeavours to represent the whole of Aotearoa New Zealand. Viewed in that context, the kinds of social interactions that occur within Parliament can be interpreted as a general indicator, and as an emergent result, of the many cultural interactions and social dynamics happening on the ground across broader New Zealand society. In this sense, the amount of te reo spoken in Parliament, or any language for that matter, reflects the current position that language has in society. The growth in te reo Māori used in Parliament appears to parallel the period from when Te Kōhanga Reo and the te reo Māori revitalisation movement began, as well as from the time when the process of settling Tiriti grievances began.
What next?
Over the summer, we interns together gathered several thousand te reo sentences, including from sources such as the historical Māori newspapers. However, over 100,000 sentences are required to train a good language model, so there is still a lot more corpus gathering to be done.
The program scripted for the Hansard debates can be run again and again as new debates are published to continue growing the corpus of te reo Māori. The script can also be adapted and reworked to sort through other text sources that consist of paragraphs and sentences, particularly bilingual text.
In addition, with a little more work on this particular code we can start to keep account of:
- The percentage of Māori spoken by each Member of Parliament over time
- The percentage of Māori spoken by each Party over time
- The amount of other Pacific/Polynesian languages spoken in Parliament
Closing thoughts
The sudden upswing in te reo in Parliament in the last 20 – 30 years is astounding. It has gone from practically 0 to 1-2% in a couple of decades. Imagine what it could look like in years to come:
- When the percentage of te reo spoken in Parliament begins to match the size of the Māori population (~15%).
- When the percentage of te reo spoken in Parliament approaches 50%, and the nation is almost 100% Māori bilingual.
No doubt, machines that have learnt to kōrero Māori will play an important part in such developments as we continue the journey onward into the technological future. Performing this mahi as a tauira intern for the Kōrero Māori project has been a great learning experience. I have been able to learn from professionals and sharpen my programming and data processing skills all for this deeply meaningful kaupapa with compelling implications for the digital future of languages indigenous to Te Moana-nui-a-Kiwa, and I am very humbled to have had the opportunity to contribute to its development.
Author
William Asiata is a BSc Mathematics graduate from the University of Canterbury and a current Master of Information Technology student at the University of Auckland. William is passionate about the development and application of social choice algorithms to the construction of social networking systems, and how this will impact the future of civic technologies. William is also interested in the social evolution of peoples across Oceania.
Ka pai Siouxsie!
The 2018 Kiwibank New Zealander of the Year Gala Awards were held in Auckland last night, with much-admired Kristine Bartlett, rest-home carer and pay equity campaigner in the healthcare sector, taking out the top honour.
Kristine’s fellow nominees included Mike King, well-known comedian turned mental health and suicide prevention campaigner, and our very own Siouxsie Wiles, award-winning microbiologist and science communicator, and principal investigator with Te Pūnaha Matatini.
Siouxsie’s research involves diseases that affect vulnerable children, in particular how to reduce the high rates of infectious diseases in New Zealand kids.
Professor Shaun Hendy, Director of Te Pūnaha Matatini, says it was an incredible achievement for Siouxsie to be named as one of the three finalists for Kiwibank New Zealander of the Year.
“She is an inspiring role model for everyone at Te Pūnaha Matatini and we are all incredibly proud to work with her,” says Shaun. “Siouxsie is driven by her curiosity about the world and a desire to make a difference in people’s lives. She thinks very deeply about the ethics and impact of her work, and this is evident in the problems she chooses to study and the approach she takes to this study. She is also a passionate believer in making science transparent to the public, and strives to make it accessible to everyone. Siouxsie works hard to make it so that science is something for everyone, not just a privileged few.”
Congratulations Siouxsie for your magnificent mahi and for being a great Kiwi. Aroha nui!
If you haven’t already seen the official awards’ video tribute to Siouxsie, here it is:
Māori and Pacific Island women in science
Before I started working as a research assistant on the Hidden Networks project, the only woman from the history of New Zealand science I could name was Joan Wiffen, the “dinosaur lady” who discovered New Zealand’s first dinosaur fossils in Hawke’s Bay. She was a remarkable woman who contributed much to palaeontology here in New Zealand; she was also, incidentally, very white. I too am outwardly (that is, I pass as) very white. But as a mixed-race woman of Samoan descent, when I started this project I was very interested to learn about the contributions of non-Pākehā – chiefly, Māori and Pacific Island – women to science in Aotearoa. For the purposes of my research, I’ve taken “woman in science” to broadly mean a woman who has made a contribution to science in New Zealand, including both professional scientists with academic backgrounds and amateur scientists who have added to the pool of knowledge in their field, like Joan Wiffen.
The more I researched, the whiter the history of women in science in New Zealand came to look. Unsurprising really: according to Elizabeth McKinley, in 1998 just 1.5% of total employees at seven Crown Research Institutes in New Zealand identified as Māori women; there were none in management positions, and only two scientists. In ‘Finding Matilda’, Kate Hannah notes that “the historiography of science in New Zealand … tends to inadvertently reinforce [the] camouflage” of women. They are marginalized, but not absent: if you go looking, as I have, you’ll find a staggering number of women in New Zealand science from the 14th century to present-day. Yet from the beginnings of European presence in New Zealand, the overwhelming majority of these women were white. A feminist revisionist history of science aims not only to make science less male-centric (i.e. demonstrate, through promotion of women’s work both quantitatively and qualitatively, that science never has been just a man’s world) but also to make it less monochromatic (so to speak), which means celebrating the scientific achievements of brown women in New Zealand’s history, and showing that science never has been just a white world either.
In fact, the first women who made scientific contributions in Aotearoa were not Pākehā but Māori. I was delighted to learn of Whakaotirangi, who in the 1300s “was responsible for safeguarding the seed of the kūmara” as the Tainui Waka journeyed to Waikato. She was the wife of Hoturoa, the leader of the Tainui Waka migration from Hawaiki to Aotearoa, but also an important historical figure in her own right. In ‘Whakaotirangi: A Canoe Tradition’, Diane Gordon-Burns and Rāwiri Taonui explore how her importance has been diminished in post-European contact accounts of the Tainui migration. Tainui and Te Arawa traditions both speak of Whakaotirangi: she appears to be a noble and important ancestor in the history of both iwi. While she is most remembered for bringing kūmara to Waikato, she was also responsible for a number of other plants brought from Hawaiki. On arrival in Waikato, Whakaotirangi built gardens in which she experimented with growing and tending to a variety of plants, both for sustenance and medicinal purposes. She discovered how to make the kūmara, which had come from a much warmer climate, grow in the cooler land her people had settled. Her work was crucial for the establishment of the Tainui people: it provided them with a reliable food supply as they adjusted to life in a new land. She was also involved in commissioning, building and launching the Tainui canoe. Her profile on the Royal Society of New Zealand website, as part of their series 150 Women in 150 Words, credits her as “one of New Zealand’s first scientists”.
Around the middle of the 1400s, another important ancestor of the Waikato people appeared. Kahu (also known as Kahupeka, Kahupekapeka, Kahukeke, or Kahurere) was a Tainui woman who experimented with plants – such as harakeke, koromiko, kawakawa and rangiora – as medicinal remedies. She did so during her great journey: walking inland through the King Country while grieving the death of her husband (who in some accounts is Rakataura; in others Uenga). She gave names to different sites along her journey (such as Te Manga-Wāero-o-Te Aroaro-ō-Kahu – ‘the stream in which Kahu’s dogskin cloak was washed’) – these names tell the story of her journey and preserve the history of the land. At some point during her journey she was ill, which may have been why she sought out plants for their medicinal properties. Unfortunately there are many different versions of Kahupeka’s story, and in them there are few mentions of her medicinal experimentations with indigenous flora. In some versions Rakataura doesn’t die, and he and Kahu traverse the countryside naming places together, as explorers.
In Māori culture, practitioners or experts in any skill or art are known as tohunga. The Tohunga Suppression Act 1907 made tohunga status a punishable offence. The Act was repealed only in 1962, and so much of the knowledge surrounding this customary way of knowing has been suppressed – my search for tohunga wahine (female practitioners) who might count as women of science has not produced significant results. However, it is worth noting that the sources I accessed relied upon the written record. Other sources, such as Māori oral histories, may be much more fruitful.
The next Māori woman in science that I was able to find wasn’t born until the 19th century. Makereti Papakura (Margaret Pattison Thom; she also went by Maggie and was of Te Arawa and Tūhourangi iwi) was born to a Māori mother and an English father in the Bay of Plenty in 1873. She was raised by her mother’s aunt and uncle in Parekarangi, a rural area. She didn’t learn English until she was ten years old, speaking only Māori until her father took over her education. After her schooling, Papakura moved to Whakarewarewa, where she became an accomplished tourist guide. She gave herself the surname Papakura after a nearby geyser when a tourist she was guiding asked if she had a Māori surname. Clearly, the name stuck. In 1891 she married surveyor Francis Joseph Dennan; they had one child together before divorcing in 1900. In 1905 she wrote Guide to the hot lakes district. Papakura travelled to England in 1912, and married Richard Charles Staples-Browne. She had first met Staples-Browne when he was on a tour of New Zealand, and had reconnected with him while she was part of a Māori tour party in England. They divorced in 1924, but Papakura remained in England and in 1926 she enrolled at Oxford University, studying for a BSc in anthropology. She died on April 16, 1930, only two weeks before her thesis, The old-time Māori – in which Papakura combined customary knowledge with scholarly conventions – was due to be examined. It was published posthumously, eight years later. Her thesis covers Māori social and familial structures, housing, weaponry and relationship with fire. She was meticulous in her writing, and wrote letters to her people in New Zealand during her drafting process, to ensure her account was as accurate as possible.
Bessie Te Wenerau Grace (1889-1944; Ngāti Tūwharetoa) was the first Māori woman university graduate, graduating from Canterbury University with a BA in 1926. She was the granddaughter of Ngāti Tūwharetoa chief Horonuku Te Heuheu. She then went on to receive an MA with first-class honours in modern languages from London University. In London she also became a nun, Sister Eudora. She worked as headmistress of St Michael’s School in Melbourne. In 1945, Dame Mira Szászy (1921-2001; Ngāti Kurī, Te Rarawa, and Te Aupōuri), a prominent Māori leader, became the first Māori woman to graduate with a degree from the University of Auckland. She went on to complete a postgraduate diploma in social sciences from the University of Hawaii and worked hard to improve the welfare of Māori women throughout her life. In 1949, Rina Winifred Moore (1923-1975; Ngāti Kahungunu, Rangitāne and Te Whānau-ā-Apanui) graduated from the University of Otago with a Bachelor of Medicine and Bachelor of Surgery – and in so doing, became the first Māori woman doctor in New Zealand. In her career she worked to improve public perceptions of the mentally ill and was one of the first doctors in New Zealand to prescribe the contraceptive pill.
It has been harder for Māori and Pacific Islanders to enter scientific professions, as they are forced to combat social prejudices that expect them to fail – that tell them this is not where they belong. It has been harder for women to enter scientific professions because, again, they have to fight against the social biases that tell them ‘this is not your world’. Until the late 20th century, many women were expected to give up their careers when they married – motherhood and the domestic sphere became their full-time responsibilities. Some women chose to remain unmarried and childless in pursuit of scientific careers, while others stopped working when they married. Māori and Pacific women have to fight both gender and racial biases for their place in the world of science. This has been the case throughout the post-contact history of Aotearoa, and continues to be so.
Today, there are increasing numbers of Māori and Pacific Island women in science, with some of them working at the intersection of traditional knowledge and western science. Dr Ocean Mercier (Ngāti Porou) is a Senior Lecturer in Māori Science (the intersection of western science and mātauranga Māori) at Victoria University of Wellington. She has a PhD in Physics and was awarded the New Zealand Association of Scientists (NZAS) inaugural Lucy Cranwell Medal (previously the Science Communicators’ Medal) in 2017. Science researcher Hokimate Harwood (Ngāpuhi) combines western scientific and Māori customary knowledge in her research of the feathers in kahu huruhuru (feather cloaks). Her use of microscopy to identify the origins of feathers used in precious cloaks has been pioneering. She is a Bicultural Science Researcher at Te Papa. Her sister, Dr Matire Harwood (Ngāpuhi; PhD MBChB), is a Senior Lecturer at the University of Auckland Medical School and has done crucial research into indigenous healthcare throughout her career. Her efforts have been widely recognised, and in 2017 she was awarded a fellowship to the L’Oréal UNESCO For Women in Science programme.
Victoria University science educator Dr Hiria McRae (Te Arawa, Tūhoe, Ngāti Kahungunu) has created and developed a new educational model aimed at raising Māori students’ engagement in high schools. Through her research projects she has made important contributions to the field of Māori education.
Victoria University astrophysicist, science lecturer and research fellow Dr Pauline Harris (Rongomaiwahine and Ngāti Kahungunu), who has a PhD in astroparticle physics, is a key figure in the revitalisation and teaching of Māori astronomy. She is also involved in the search for extra-solar planets. Connected to Harris’s Māori astronomy programme is Pounamu Tipiwai Chambers, an undergraduate student at Victoria University who has employed Māori astronomical and navigational knowledge in undertaking waka voyages across the Pacific.
Another remarkable young woman, Alexia Hilbertidou (of Greek and Samoan descent), founded GirlBoss New Zealand, an organisation aimed at empowering young women in STEM studies, after she felt alienated as the only girl in her year thirteen physics for engineering class. She was also part of NASA’s SOFIA project, making her the youngest person ever to be part of a NASA mission.
My blog post aims to contribute towards the unmasking of Māori and Pacific women’s contributions to science in both historical and contemporary landscapes. We are already seeing some important changes: many Māori women in science today combine customary and scientific knowledge to great success, a road paved by Makereti Papakura and her BSc thesis. However, Māori and Pacific women are still dramatically under-represented in fields of science, particularly at senior and management levels. It is therefore important that we keep up the momentum of positive change not only by looking forward but also by looking back: the successes of past figures provide an encouraging bevy of ‘shoulders to stand on’ for women in science today.
This post was written as part of my summer scholarship research on the Hidden Networks project, supervised by Rebecca Priestley and Kate Hannah.
Further reading
If you’re interested in learning more about the women I’ve mentioned, you might enjoy some of these sources:
- Colenso, William, ‘Contributions towards a better Knowledge of the Maori Race’, in Transactions and Proceedings of the Royal Society of New Zealand 1868-1961, Vol. 14, 1881, pp. 33-48. http://rsnz.natlib.govt.nz/volume/rsnz_14/rsnz_14_00_000690.html
- ‘Dr Ocean Mercier wins prestigious Science Communicator’s Medal.’ https://www.victoria.ac.nz/news/2017/11/dr-ocean-mercier-wins-prestigious-science-communicators-medal
- GirlBoss New Zealand. https://www.girlboss.nz/
- Hilbertidou, Alexia, ‘NASA SOFIA Experience’, U.S. Embassy & Consulate in New Zealand. https://nz.usembassy.gov/alexia-hilbertidou-nasa-sofia-experience/
- ‘Hokimate Harwood – Identifying feathers’, Museum of New Zealand Te Papa Tongarewa. https://collections.tepapa.govt.nz/topic/3657
- Mack, Ben, ‘How Dr Matire Harwood is addressing inequities in healthcare for indigenous people’, Idealog, 3 November 2017. https://idealog.co.nz/etc/2017/11/how-dr-matire-harwood-addressing-inequities-healthcare-indigenous-people
- ‘Māori science education model developed’, Radio New Zealand, 28 August 2015. https://www.radionz.co.nz/news/te-manu-korihi/282635/maori-science-education-model-developed
- McKinley, Elizabeth. ‘Brown Bodies, White Coats: Postcolonialism, Māori women and science’, in Discourse: Studies in the Cultural Politics of Education 26 no. 4, 2005, pp. 481-496. http://www-tandfonline-com.helicon.vuw.ac.nz/doi/full/10.1080/01596300500319761?scroll=top&needAccess=true
- Morton, Jamie, ‘Royal Society tackling diversity issues’, New Zealand Herald, 26 October 2016. http://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=11736157
- Morton, Jamie, ‘Q&A: NZ science’s own “Hidden Figures”’, New Zealand Herald, 24 January 2017. http://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=11787672
- Northcroft-Grant, June. ‘Papakura, Makereti’, Dictionary of New Zealand Biography, first published in 1996. https://teara.govt.nz/en/biographies/3p5/papakura-makereti
- ‘Researcher to teach traditional Māori astronomy’, Radio New Zealand, 17 June 2013. https://www.radionz.co.nz/news/te-manu-korihi/137868/researcher-to-teach-traditional-maori-astronomy
- Royal, Te Ahukaramū Charles, ‘Waikato tribes – Ancestors’, Te Ara – the Encyclopedia of New Zealand, http://www.TeAra.govt.nz/en/waikato-tribes/page-3
- Shaw, Aimee, ‘Meet Alexia Hilbertidou, the 18-year-old founder of GirlBoss and the youngest person to be involved with Nasa’s Sofia mission’, New Zealand Herald, 11 July 2017. http://www.nzherald.co.nz/business/news/article.cfm?c_id=3&objectid=11888757
- Tipiwai Chambers, Pounamu. ‘Te Ara Tauira’ in Salient, 1 May 2017. http://salient.org.nz/2017/05/te-ara-tauira-4/
Author
Beth Rust is a BA(Hons) history graduate from Victoria University of Wellington. For her Honours thesis she researched the writings of Christine de Pizan, a 15th-century humanist and early defender of womankind. For the past three months she has been working as a research assistant on the project ‘Hidden Networks: hybrid approaches for the history of science’. Beth is just about to start a job in the public service, and she is very excited to take the skills she has learned from her summer research into her new role. She loved being a summer scholar.
How machine learning can perpetuate racism
I wrote this algorithm to classify people by gender, but one of the biggest things I learned was how machine learning can reinforce racism and perform poorly on ethnic minorities.
Machine learning – or programs that are able to learn from and improve on past experience and data – is often accused of reinforcing human biases such as racism and sexism. However, it can be a bit unclear how exactly this happens.
How does an automatic soap dispenser fail to recognize black people’s hands? How does image recognition software come to classify people in kitchens as women, regardless of their actual gender? How does artificial intelligence that seeks to predict criminal recidivism produce results that are consistently biased against black people?
This walk-through hopes to give you a bit of an insight into one example of racism in machine learning, and how this comes to be.
The algorithm will be used as part of research into gender equity in STEM fields in New Zealand. A lot of information about who works in certain research centres or who graduated from university is publicly available online (for example, here are university records from NZ between 1870 and 1961), but it doesn’t explicitly include their gender. While a person reading the information can usually guess their gender quite easily and with a high degree of accuracy, it’s obviously very impractical to read and classify thousands or hundreds of thousands of observations. This is where this algorithm hopes to simplify and speed up the process of identifying women in STEM fields.
Training and testing data: Selecting appropriate data
Getting good data for the training and test sets is a really important part of machine learning. Your model is only as good as the data you train and test it on, so getting this right is key.
The starting point of my dataset is the 100 most common names for boys and girls born in New Zealand in each year, going back to 1954. One major drawback of this dataset is that it only includes people born in New Zealand, not those who immigrated here. This means the dataset is almost exclusively made up of Anglo-Saxon names, and does not reflect New Zealand’s large Asian and Pacific populations.
It also doesn’t include any Māori names, presumably because the Māori population isn’t large enough for these names to make the top 100 list. I’ve tried to remedy this by adding the top 20 Māori names for boys and girls from several years to the dataset. However, 91% of the training dataset is still made up of Anglo-Saxon names, while only 9% is made up of Māori names.
These biases in the training dataset mean that the model is likely to recognize the patterns that indicate gender in Anglo-Saxon names, while not picking up on patterns that indicate gender in the names of other cultures. The same biases in the testing dataset mean that the accuracy of the model probably only applies to Anglo-Saxon names, and that it may do much worse on names of other nationalities.
Selecting useful features for the algorithm
It’s important to consider what features would be most useful in predicting the desired classes. I started off by using the last letter of each name to predict gender. Most Anglo-Saxon names for men end with a consonant, while most Anglo-Saxon names for women end with a vowel.
There are also some pairs of letters that are more common for one gender than the other. For example, the last letter ‘n’ is indicative of a male name (e.g. Brian, Aidan, John), but the suffix ‘yn’ is indicative of a female name (e.g. Robyn, Jasmyn). Because of this, using both the last letter of each name and the two-letter suffix as features results in higher accuracy than just using the final letter. This gave me an accuracy of about 73% on a testing dataset that includes both Anglo-Saxon and Māori names.
This overall accuracy is lower than it would have been on a testing dataset made up of only Anglo-Saxon names because these features don’t perform as well with names of other origins. In a New Zealand context, this causes the most problems with Māori names. Most Māori names end in vowels, regardless of gender (examples of male Māori names include Tane and Nikau, while female Māori names include Aroha and Kaia). This means this particular feature doesn’t do a very good job with names of Māori origin.
The same problem would likely apply to other ethnicities, too. For example, Japanese, Chinese, Vietnamese, Italian and Hispanic names all often end in vowels, regardless of gender.
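To make the two features concrete, here is a minimal sketch of how they might be extracted. The function names `name_features` and `ends_in_vowel` are made up for illustration and are not taken from the original algorithm; the example names are the ones mentioned above.

```python
# Illustrative sketch only: these helper names are assumptions, not the
# code actually used for the algorithm described in this post.
VOWELS = set("aeiouāēīōū")  # includes macronised vowels used in te reo Māori


def name_features(name):
    """Return the final letter and the two-letter suffix of a name."""
    cleaned = name.strip().lower()
    return {"last_letter": cleaned[-1:], "suffix": cleaned[-2:]}


def ends_in_vowel(name):
    """True if the name's final letter is a vowel."""
    return name.strip().lower()[-1:] in VOWELS


for name in ["Brian", "Robyn", "Tane", "Nikau", "Aroha", "Kaia"]:
    print(name, name_features(name), "ends in vowel:", ends_in_vowel(name))
```

Run on these names, the male names Tane and Nikau end in a vowel just as the female names Aroha and Kaia do, so a final-letter feature on its own carries no information that could separate them.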
Imbalanced classes and the problems they cause
Imbalanced classes, or classes that are very different in size, can also create problems for machine learning algorithms. In this case, the ethnic groups are very different in size, and ethnicity is likely to influence people’s names. In the 2013 census, 74% of New Zealanders identified as European, 15% as Māori, 12% as Asian and 7% as Pacific. (Statistics New Zealand allows people to identify with more than one ethnicity, so these numbers don’t add up to 100%.)
Imbalanced classes often result in high accuracy within the majority class (in this case, European) and low accuracy within the minority classes (Māori, Asian and Pacific). This particular algorithm has an overall accuracy of about 73%. The accuracy within Māori names is about 69%, while the accuracy within European names is 75%.
The class imbalances in the data explain why the overall accuracy may not be a very good way of assessing whether the algorithm is working well. As well as checking the accuracy within each subgroup, it can be a good idea to look at precision and recall for more information on where the algorithm is doing well and where it’s doing poorly.
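Checking accuracy within each subgroup could look something like the sketch below. The function name `accuracy_by_group` and the handful of records are invented purely for illustration; they are not the real test data.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy overall and within each group.

    `records` is an iterable of (group, true_label, predicted_label) tuples.
    The key "overall" is reserved for the accuracy across all records.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        for key in (group, "overall"):
            total[key] += 1
            if truth == predicted:
                correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


# Made-up toy records, NOT the real test set.
toy_records = [
    ("European", "F", "F"),
    ("European", "M", "M"),
    ("European", "M", "M"),
    ("European", "F", "M"),
    ("Māori", "M", "F"),
    ("Māori", "F", "F"),
]

result = accuracy_by_group(toy_records)
for group, accuracy in result.items():
    print(f"{group}: {accuracy:.2f}")
```

In this toy data the overall accuracy sits close to the accuracy of the larger group, which is how a single headline figure can hide poor performance on a smaller group.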
Precision tells us how much of a classified group actually belongs to that group. In this case, for example, the precision of female names is the percentage of names classified as female that are actually female. It is calculated by dividing the number of true positives (women classified as female) by the number of all positives (women and men classified as female).
Recall is the percentage of a particular group that has been classified as belonging to that group. For example, recall of male names is the percentage of male names that have been classified as male. It is calculated by dividing the number of true positives (men classified as male) by the sum of true positives and false negatives (men classified as female).
The tables below show the precision, recall and a couple of other metrics on how well the algorithm is doing. The differences between the overall table and the tables by ethnicity show that it’s likely that this algorithm is systematically worse with non-Anglo-Saxon names, specifically Māori names in this instance.
Overall:
| | precision | recall | F1 score | support |
| --- | --- | --- | --- | --- |
| F | 0.77 | 0.76 | 0.77 | 274 |
| M | 0.71 | 0.72 | 0.72 | 226 |
| avg/total | 0.74 | 0.74 | 0.74 | 500 |
For Māori names only:
| | precision | recall | F1 score | support |
| --- | --- | --- | --- | --- |
| F | 0.75 | 0.88 | 0.81 | 17 |
| M | 0.33 | 0.17 | 0.22 | 6 |
| avg/total | 0.64 | 0.70 | 0.66 | 23 |
Here we can see that both precision and recall are very low for male Māori names. This means that only a small percentage of the names classified as being male actually are male (low precision) and an even smaller percentage of male Māori names have been classified as being male (low recall).
This is probably because most Māori names end in vowels, regardless of their gender. The algorithm does alright on female Māori names, because it has seen many instances of female names ending in vowels before. But it hasn’t seen many male names ending in vowels, so it fails to classify most of these names correctly.
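As a worked example of the precision and recall definitions, the sketch below uses confusion counts worked backwards from the Māori-names table above. With 6 male names and a recall of 0.17, only 1 can have been classified correctly, and a precision of 0.33 then implies 3 names were classified as male in total. These counts are a reconstruction from the published figures, not output from the original code, and the function names are made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# counts[(true_label, predicted_label)], reconstructed from the table above
# (an assumption, not the original output).
counts = {("F", "F"): 15, ("F", "M"): 2, ("M", "F"): 5, ("M", "M"): 1}


def metrics_for(label, counts):
    """Treat `label` as the positive class and score it."""
    tp = counts[(label, label)]
    fp = sum(n for (truth, pred), n in counts.items() if pred == label and truth != label)
    fn = sum(n for (truth, pred), n in counts.items() if truth == label and pred != label)
    return precision_recall_f1(tp, fp, fn)


for label in ("F", "M"):
    p, r, f1 = metrics_for(label, counts)
    print(f"{label}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
# F: precision=0.75 recall=0.88 F1=0.81
# M: precision=0.33 recall=0.17 F1=0.22
```

Under this reconstruction, five of the six male Māori names were classified as female, which is exactly the failure the vowel-ending explanation predicts.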
For European names only:
| | precision | recall | F1 score | support |
| --- | --- | --- | --- | --- |
| F | 0.82 | 0.72 | 0.77 | 140 |
| M | 0.70 | 0.81 | 0.75 | 115 |
| avg/total | 0.77 | 0.77 | 0.77 | 255 |
Because machine learning algorithms with imbalanced classes usually do worse in the smaller classes, they can further marginalise minority groups by routinely misclassifying them or failing to take into account patterns that are unique to the smaller group. In this example, this is likely to be the case with ethnic minorities.
It seems that this algorithm is likely to really only do a good job on Anglo-Saxon names. This limits the situations in which it would be appropriate to use it, and risks reinforcing Eurocentricity and a focus on whiteness.
This example shows how difficulties in getting hold of representative datasets, selecting features and dealing with imbalanced classes can cause algorithms to perform poorly on minority groups. These are only a few of the many ways machine learning can contribute to the marginalisation of minorities, and it’s important to consider how this might happen in the particular algorithm you’re working on.
The consequences of bias in machine learning can range from the irritation of not being able to get soap out of an automatic dispenser, to the devastation of being given a longer prison sentence. As these algorithms become more and more ubiquitous, it is essential that we consider these consequences in the design and application of machine learning.
See this paper for a more detailed look at how imbalanced classes affect machine learning algorithms.
Author
Emma Vitz is a recent Statistics & Psychology graduate of Victoria University who is starting a new role at an actuarial consulting company in Auckland. Emma enjoys applying data science techniques to all kinds of problems, especially those involving people and the way they think.