Spanish Linguistic Variation in Digital Spaces

SLViDS research group's project on 'que'-drop and form-meaning-function pairing in language cognition, variation and change.

Join a team of dedicated linguists and programmers to understand how and why the Spanish language is changing. 

Project description

Join our Spanish Linguistic Variation in Digital Spaces (SLViDS) research group to explore how and why the Spanish language is changing over time.

Our current project focuses on the extent to which a structure like "Espero que estés bien" alternates with "Espero estés bien". In these examples, speakers can choose between keeping and dropping the complementizer "que", and their choices may reflect broader cross-linguistic tendencies related to predictability, verb frequency, verb type, variation, and change over time.

To explore these issues, our SLViDS group comprises two teams: the Programming & Coding Solutions team and the Qualitative Analysis team. The first team is responsible for extracting and organizing Spanish language data using Python, for using spaCy to do part-of-speech and syntactic dependency tagging, and for offering solutions to filter out data that do not contain the type of structure we want to study.

The Qualitative Analysis team reviews and analyzes the morphosyntactic structure of extracted Spanish data to categorize and tag individual instances of relevant clauses. They also communicate to the Programming & Coding Solutions team how the analysis workflow can be made more efficient by visually organizing the extracted language data.
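As a rough illustration of the filtering step handled by the Programming & Coding Solutions team, the sketch below sorts already-parsed sentences by whether a complement clause keeps or drops "que". It is not the group's actual pipeline. The `Token` class and the `classify_complement` function are hypothetical names, the dependency labels ("ccomp", "mark", "cop") follow Universal Dependencies conventions, and the parses are hand-annotated for the example rather than produced by spaCy, so a real parser may analyze these sentences differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical stand-in for one parsed token (not a spaCy class)."""
    text: str
    pos: str   # part-of-speech tag, e.g. "VERB"
    dep: str   # dependency label, e.g. "ccomp" or "mark"
    head: int  # index of this token's head within the sentence


def classify_complement(tokens: List[Token]) -> Optional[str]:
    """Return "que", "zero", or None when no complement clause is found."""
    for i, tok in enumerate(tokens):
        if tok.dep != "ccomp":
            continue
        # Check whether a complementizer "que" attaches to the clause head.
        has_que = any(
            t.head == i and t.dep == "mark" and t.text.lower() == "que"
            for t in tokens
        )
        return "que" if has_que else "zero"
    return None


# Hand-annotated parses, for illustration only.
with_que = [
    Token("Espero", "VERB", "ROOT", 0),
    Token("que", "SCONJ", "mark", 3),
    Token("estés", "AUX", "cop", 3),
    Token("bien", "ADV", "ccomp", 0),
]
without_que = [
    Token("Espero", "VERB", "ROOT", 0),
    Token("estés", "AUX", "cop", 2),
    Token("bien", "ADV", "ccomp", 0),
]
no_clause = [Token("Hola", "INTJ", "ROOT", 0)]

corpus = [with_que, without_que, no_clause]
labels = [classify_complement(sentence) for sentence in corpus]

# Keep only sentences that contain the structure under study.
relevant = [s for s, label in zip(corpus, labels) if label is not None]

print(labels)         # ['que', 'zero', None]
print(len(relevant))  # 2
```

Sentences labeled None would be filtered out, while the "que" and "zero" cases would move on to the Qualitative Analysis team for review.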

Both teams are looking for new members! We already work with an energetic group of faculty and graduate students from the University at Buffalo, North Carolina State University, and la Universidad Isabel I in Burgos, Spain. People with Spanish skills, linguistics knowledge, and programming experience all have a role to play! 

Project outcome

Student team members will have the opportunity to present their work at academic conferences at UB and beyond. Those who participate in the project long-term will be involved in grant writing and co-authored article publications. 

Project details

Timing, eligibility and other details
Length of commitment: Varies (short, medium, long or year-long), depending on student interests, goals, and competing responsibilities
Start time: Anytime
In-person, remote, or hybrid? Remote
Level of collaboration: Large group collaboration (4+ students)
Benefits: Possible academic credit and possible funding for student involvement
Who is eligible: All undergraduate students with Spanish language skills (morphology, semantics, syntax) and/or experience in natural language processing with spaCy, corpus linguistics, statistics for linguistics/social sciences (mixed-effects modeling, non-parametric methods), sociolinguistics, or Hispanic linguistics

Core partners

Project mentor

Adrián Rodríguez Riccelli

Assistant Professor of Spanish

Romance Languages and Literatures

Phone: (716) 645-3820

Email: arriccel@buffalo.edu

Start the project

  1. Email the project mentor using the contact information above to express your interest and get approval to work on the project. (Here are helpful tips on how to contact a project mentor.)
  2. After you receive approval from the mentor to start this project, click the button to start the digital badge. (Learn more about ELN's digital badge options.) 

Preparation activities

Once you begin the digital badge series, you will have access to all the necessary activities and instructions. Your mentor has indicated they would like you to also complete the specific preparation activities below. Please reference this when you get to Step 2 of the Preparation Phase. 

Must have low-advanced Spanish proficiency or higher and/or Python programming and/or natural language processing skills. Consult Dr. Rodríguez Riccelli and/or Dr. Balukas for any relevant readings. Attend some of our (bi-)weekly meetings to get an idea of our workflow, goals, methods, etc.

Keywords

Romance Languages and Literatures, Computer Science, Spanish, Digital text, Natural Language Processing, Linguistics (corpus linguistics, sociolinguistics, Hispanic linguistics, morphosyntax, semantics), programming/computer science