• Work
  • about

priyana patel

 

Unheard Voices

Exploring the Complex Relations Between Indigenous Peoples and Settlers in American Colonialism

Brief

A Computational Textual Analysis of 18th & 19th-Century Native American Understandings of Identity

Client
DH199
Online Statistical Computing Reference (OSCR)

 

Duration
Winter 2020
10 weeks

 

Roles
Textual Analysis
Word Frequency & Collocation Analysis

 
Text Analysis (Webpage).png
 

Team: Eustina Kim, Michelle Lee, Priyana Patel, Vicki Truong

 

 

Project Overview

The Corpus

American State Papers (1789 to 1838)
A collection of legislative and executive documents from Congress. The analysis used documents categorized as “Indian Affairs.”

Treaty Council Notes (1784 to 1814)

The United States and Native Americans used peace treaties to foster mutual respect and discuss land acquisition.

 

Research Questions

How do Native leaders talk about being “Indian” and notions of difference?

What are the dominant themes in the corpus, and which documents are strongly correlated with the themes?

 

 

Text Preprocessing

Using the Natural Language Toolkit Library to Clean and Prepare the Documents

First, we tokenized the corpus, breaking each document into individual words known as tokens. We then converted the tokens to lowercase so that the analysis is not case-sensitive. We also dropped non-alphabetic tokens and removed stop words: commonly used words that add little to our research, such as “the,” “and,” and “is.”
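The steps above can be sketched roughly as follows. This is a minimal illustration that uses only the Python standard library instead of NLTK, so the regex tokenizer and the tiny `STOP_WORDS` set are stand-ins (assumptions), and the sample sentence is invented rather than taken from the corpus. The pronouns are deliberately left out of the stop word set here because the later analysis depends on them.

```python
import re

# Tiny illustrative stop word set (an assumption); NLTK's English list is much longer.
STOP_WORDS = {"the", "and", "is", "a", "an", "of", "to", "in", "that", "it"}

def preprocess(text, stop_words=STOP_WORDS):
    """Tokenize, lowercase, keep alphabetic tokens, and drop stop words."""
    # Split into runs of word characters and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    # Lowercase so that "Land" and "land" count as the same token.
    tokens = [t.lower() for t in tokens]
    # Drop punctuation, numbers, and mixed alphanumeric tokens.
    tokens = [t for t in tokens if t.isalpha()]
    # Drop stop words.
    return [t for t in tokens if t not in stop_words]

# Invented sample sentence, for illustration only.
sample = "Brothers, we have heard your words, and the land is ours since 1784."
print(preprocess(sample))
# ['brothers', 'we', 'have', 'heard', 'your', 'words', 'land', 'ours', 'since']
```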

Preprocessing.png
 

 

Word Frequency

Word Frequency Quantifies the Frequency and Occurrences of Particular Words Across a Corpus

Using the wordcloud library in Python, we can visualize the most commonly used words throughout the documents. Based on the 50 most frequently occurring terms within the corpus, I broke down the results by pronoun for “i,” “we,” and “you.”
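The counting behind a breakdown like this can be sketched with `collections.Counter`. The helper names `top_terms` and `pronoun_counts` are hypothetical, and the token list is an invented stand-in for the preprocessed corpus, so the numbers say nothing about the real documents.

```python
from collections import Counter

def top_terms(tokens, n=50):
    """Return the n most frequent tokens as (token, count) pairs."""
    return Counter(tokens).most_common(n)

def pronoun_counts(tokens, pronouns=("i", "we", "you")):
    """Count how often each pronoun of interest occurs."""
    counts = Counter(tokens)
    # Counter returns 0 for pronouns that never occur.
    return {p: counts[p] for p in pronouns}

# Invented toy tokens, standing in for the preprocessed corpus.
tokens = "we hear you brother we hold the land i speak we listen".split()
print(top_terms(tokens, n=3))
print(pronoun_counts(tokens))
```

A frequency table of this kind is also the usual input for drawing a word cloud.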

WF & WC.png
 

 

Collocation Analysis

Collocation Analysis Is the Process of Examining Statistically Significant Word Pairings

I first computed the 15 most frequently occurring bigrams (two adjacent terms) and trigrams (three adjacent terms) to understand common themes between speaking parties. To better understand differences by speaker, I created an n-gram filter that keeps only the n-grams containing “i,” “we,” or “you.”
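A minimal sketch of the n-gram counting and pronoun filter, again using only the standard library. It ranks n-grams by raw frequency, not by a statistical association measure, and the names `ngrams` and `top_ngrams` and the toy tokens are assumptions made for illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all runs of n adjacent tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))

def top_ngrams(tokens, n, k=15, must_contain=None):
    """Return the k most frequent n-grams, optionally filtered by a term."""
    grams = ngrams(tokens, n)
    if must_contain is not None:
        # Keep only the n-grams that include the given term, e.g. a pronoun.
        grams = [g for g in grams if must_contain in g]
    return Counter(grams).most_common(k)

# Invented toy tokens, standing in for the preprocessed corpus.
tokens = "we are brothers we are friends you are brothers".split()
print(top_ngrams(tokens, 2, k=3))                 # most frequent bigrams
print(top_ngrams(tokens, 3, must_contain="you"))  # trigrams containing "you"
```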

CA.png
 

 

Topic Modeling

Topic Modeling Is a Type of Statistical Modeling for Discovering Abstract Topics (Gensim & PyLDAvis)

To determine how many topics to use, we compared coherence scores, which measure the degree of semantic similarity between the high-scoring words within a topic. We then built our LDA model with six topics, each a group of relevant terms, and labeled each group.
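To make the idea of a coherence score concrete, here is a small sketch of one common measure, UMass coherence, computed from document co-occurrence counts. Several coherence measures exist and the one used in this project is not stated, so the choice of UMass is an assumption. The function name and toy documents are hypothetical too.

```python
import math
from itertools import combinations

def umass_coherence(top_words, documents):
    """UMass coherence for one topic's top words (ranked best first).

    Every word in top_words must occur in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    # combinations() yields (higher-ranked, lower-ranked) word pairs.
    for earlier, later in combinations(top_words, 2):
        score += math.log((doc_freq(earlier, later) + 1) / doc_freq(earlier))
    return score

# Invented toy documents (already tokenized), for illustration only.
docs = [
    ["land", "treaty", "peace"],
    ["land", "treaty"],
    ["war", "chief"],
    ["war", "chief", "land"],
]
print(umass_coherence(["land", "treaty"], docs))   # words that co-occur
print(umass_coherence(["treaty", "chief"], docs))  # words that never co-occur
```

Scores closer to zero indicate top words that tend to appear in the same documents. To choose the number of topics, one would average a score like this over all topics of each candidate model and compare the averages.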

TM.png
 

 

Findings

Word Frequency
Of the three pronouns, “i” occurred most frequently and “you” least frequently.

Collocation Analysis
The highlighted examples indicate an effort to create distance, underscoring differences between the speaker and the person/people they address.

Topic Modeling
Topic 2 accounts for the largest proportion of the corpus and has the greatest number of documents most strongly related to it. Topic 5 stands out because its relevant terms are more community-based in nature.
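Counting the documents most strongly related to each topic can be sketched as follows: take each document's topic probabilities, pick the topic with the highest value, and tally the results. The probabilities below are invented, the helper name is hypothetical, and topics are indexed from 0 here, which may not match the numbering in the figures.

```python
from collections import Counter

def dominant_topic_counts(doc_topic_probs):
    """Count how many documents have each topic as their strongest topic."""
    # For each document, find the index of the largest topic probability.
    dominant = [
        max(range(len(probs)), key=probs.__getitem__)
        for probs in doc_topic_probs
    ]
    return Counter(dominant)

# Invented per-document topic probabilities (three topics, four documents).
doc_topic_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.3, 0.5],
]
print(dominant_topic_counts(doc_topic_probs))
```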

Distribution.png
 

Bigrams

 

Intertopic Distance Map

 

 

Discussion

 
 

There’s an Ongoing Issue With History Being Presented From a Euro-American Perspective

Our research reveals the hierarchies of power within Native American communities in contrast to the American government. Native Americans communicated using familial language (father, brother) and unifying pronouns (we, us) to speak on behalf of their communities. The topic labels reveal a contested relationship between Euro-Americans and Native Americans, especially over land and treaties. The textual analysis emphasizes the dynamics between these two groups in meetings, times of war, and land negotiations.

 
 
 

A Precursor for Gender Roles in Native American Communities

With a better understanding of the familial power dynamic in Native communities, we can take a more in-depth look at how gender defined communal roles. Men took on the role of decision-makers: they attended council meetings and acted as their tribes’ representatives, making decisions about trade, land, and war. These roles are consistent with the male-dominated terms and significant themes in our findings. Future research can examine how power and control translate across duties and responsibilities within tribes.