School districts and the companies that work with them need to put a greater emphasis on ensuring that the massive amounts of student data collected every year are cleared of identifying information before records are shared or released to outside organizations.
While many districts and vendors are contemplating how to keep student information confidential, the “de-identification” of sensitive student data is not always technically simple to do, and a lack of resources means some districts are ill-equipped for the task, said Jules Polonetsky, the executive director of the Future of Privacy Forum.
“De-identification isn’t easy,” Polonetsky said. “It’s really hard when you’re dealing with an open system like schools, where parents, volunteers, and teachers need access to student data for collaboration, but at the same time we want the best standards of privacy control.”
To shed light on the legal and technical gray area involving the sharing and use of student data, the forum released a paper this month on the de-identification of sensitive student information.
Ideally, that process involves purging student records of any information directly linked to an individual student, as well as removing or obscuring any indirect information that could allow others to figure out who a student is, before the records are shared with a third party.
But that is easier said than done in an era in which students generate significant amounts of digital information, data brokers use public and private sources to amass extensive profiles of individuals, and researchers and vendors use increasingly advanced statistical and mathematical techniques in the course of everyday business.
Room for Interpretation
Personally identifiable information generally includes a student’s name and address, the names of family members, and such individual identifiers as a Social Security number or student ID. Examples of indirect identifiers might include a student’s date and place of birth, race, religion, weight, financial information, and mother’s maiden name.
“De-identification” of student data is the process of purging or altering identifiable information from student records before it is shared with organizations outside the school district. Personally identifiable information generally includes a student’s name and address, the names of family members, and personal identifiers, such as Social Security number or student ID. Indirect identifiers might include the student’s date and place of birth, race, religion, weight, financial information, or mother’s maiden name.
Three key techniques for de-identifying student data include:
• Blurring: Reducing the precision of disclosed data to minimize individual identification. This may involve grouping data into broader categories so unique cases are not highlighted. For example, student information might be placed in a “minority” category rather than individual categories, such as African-American or Hispanic.
• Perturbation: Making small changes to data to prevent individual identification from unique or rare population groups. Some data can be swapped among individuals for analyses in which certain factors don’t matter. For instance, if birthplace is irrelevant, one student’s birthplace can be switched with another’s.
• Suppression: Removing some forms of data, such as race or place of birth, altogether to prevent individual identification, particularly within small groups.
Source: Future of Privacy Forum
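The three techniques listed above can be sketched in a few lines of Python. This is only an illustration, not the forum's method: the records, the field names, the `BROAD_CATEGORY` mapping, and the function names are all hypothetical, and a real release would need a far more careful review of re-identification risk.

```python
import random

# Hypothetical student records; every field name and value is invented.
records = [
    {"grade": 9, "race": "Hispanic", "birthplace": "Ohio", "score": 81},
    {"grade": 9, "race": "African-American", "birthplace": "Texas", "score": 74},
    {"grade": 10, "race": "White", "birthplace": "Maine", "score": 90},
]

# Assumed mapping from specific categories to a broader one.
BROAD_CATEGORY = {"Hispanic": "minority", "African-American": "minority"}


def blur(rows, field, mapping):
    """Blurring: replace specific values with broader categories."""
    return [{**r, field: mapping.get(r[field], r[field])} for r in rows]


def perturb_by_swapping(rows, field, seed=0):
    """Perturbation: shuffle one field's values among the records."""
    values = [r[field] for r in rows]
    random.Random(seed).shuffle(values)
    return [{**r, field: v} for r, v in zip(rows, values)]


def suppress(rows, fields):
    """Suppression: drop the named fields from every record."""
    return [{k: v for k, v in r.items() if k not in fields} for r in rows]


blurred = blur(records, "race", BROAD_CATEGORY)
swapped = perturb_by_swapping(records, "birthplace")
suppressed = suppress(records, {"race", "birthplace"})
```

Each function returns new records and leaves the originals untouched. Note that a random shuffle can leave some values where they started, so swapping on its own guarantees nothing about any single record.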
At the federal level, the handling of such information is governed primarily by the federal Family Educational Rights and Privacy Act. FERPA, as the law is commonly known, “prohibits the disclosure of education records containing personally identifiable student data without parent or eligible student consent,” according to the forum’s paper.
But that leaves open considerable room for interpretation, and for disagreement.
The forum, which is closely aligned with industry and is the prime mover behind a voluntary pledge on protection of student-data privacy that more than 160 companies have signed, believes that “appropriately de-identified” student information is not covered by FERPA.
“Properly de-identified student data thus may be shared without limitation under FERPA (although other federal and state privacy laws may apply),” the group writes. “Furthermore, ‘de-identified’ information from education records is not subject to any destruction requirements because, by definition, it is not ‘personally identifiable information.’ ”
But Fordham University law professor and privacy expert Joel Reidenberg, also an academic adviser to the forum, pointed out what he views as problems with that approach and the ways in which it highlights “holes in FERPA’s scope of coverage.”
Technical and statistical advances have made it “easier and easier” to take information that has ostensibly been “de-identified” and link it back to individual students, he said.
In addition, Reidenberg said, “customized profiles” of individual students constructed on the basis of their interactions with technology may not include a student’s name or address, but are still being used to make critical decisions about what and how students are taught.
Most troubling, he maintained, is that neither parents nor advocacy groups appear to have standing to challenge the ways in which FERPA is being interpreted, applied, and regulated by the U.S. Department of Education. When the Electronic Privacy Information Center tried to sue the department over controversial regulations it issued in 2011, the courts dismissed the case, saying the organization lacked legal standing.
Some of that situation could be poised to change. In recent months, bills have been introduced in Congress that would either rewrite FERPA or create entirely new federal privacy laws aimed at protecting student information.
Setting District Standards
In the meantime, school districts are left to grapple with the issue.
Bob Moore, an education technology consultant who leads privacy initiatives for the Consortium for School Networking, said there’s a wide variety of understanding and action on de-identification of student data and other privacy issues among districts.
“Some are more sophisticated,” he said. “Others aren’t sure what their role is in doing this or how important it is.”
The 52,500-student Howard County, Md., district has taken the issue seriously. It created the position of coordinator of data privacy, and it strictly controls how information is released.
Those controls, said Teddy Hartman, the data-privacy coordinator, include suppressing data on groups containing fewer than 10 students; encouraging some vendors to use school or classroom accounts instead of individual student accounts when possible; minimizing the amount of student data vendors have access to in order to make their digital tools work; and making vendors sign data-privacy contracts that are stricter than the forum鈥檚 privacy-protection pledge.
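The first of those controls, withholding figures for groups of fewer than 10 students, is a form of small-group suppression. A minimal sketch of how such a rule might be applied to a summary report follows. The threshold of 10 comes from Hartman's description; the function name, field names, and sample data are hypothetical and do not represent the district's actual system.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 10  # threshold described by the district


def summarize(rows, group_key, value_key, min_size=MIN_GROUP_SIZE):
    """Report count and mean per group, withholding groups below min_size."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    report = {}
    for name, values in groups.items():
        if len(values) < min_size:
            report[name] = None  # suppressed: too few students to publish
        else:
            report[name] = {"n": len(values), "mean": sum(values) / len(values)}
    return report


# Hypothetical data: 12 students in one program, 3 in another.
rows = [{"program": "A", "score": 80} for _ in range(12)]
rows += [{"program": "B", "score": 70} for _ in range(3)]

report = summarize(rows, "program", "score")
```

Here program "A" is reported and program "B" is withheld. A rule like this is only a starting point: if overall totals are also published, a suppressed group's figures can sometimes be recovered by subtraction, so released totals need review as well.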
The district uses FERPA as the “foundation, not the ceiling,” Hartman said. “It’s a baseline. We certainly strive to be more protective around data.”
In addition, he said, any new digital tools (including programs, apps, and software being considered for use) must go through a stringent process that includes privacy concerns from the start.
“Privacy has to be thought of upfront as you’re designing programs, not as an afterthought,” he said.
Elsewhere, districts are struggling with how to keep student data safe.
A detailed Louisiana privacy law that went into effect during the last school year bars schools from collecting more than two pieces of personally identifiable information about students without parental consent, among other requirements. Despite guidance from state education officials, districts there have struggled to comply with the law.
But Sheryl Abshire, the chief technology officer for the 32,600-student Calcasieu Parish schools in Lake Charles, La., said the new law is pushing districts to confront the issues.
Among other strategies, Abshire said, she had to make every vendor working with her district sign a contract addendum with extensive requirements on student-data privacy. She said most vendors she’s worked with seem fairly sophisticated about de-identifying student data and addressing other privacy issues.
Because of the state law, Louisiana districts may be ahead of others in the country, Abshire said.
But she also said that fears about possible misuse of student data shouldn’t prevent districts from using technology extensively to improve teaching and learning.
“We must be responsible around data, but also responsible around student learning,” she said. “We shortchange students and our community if we step back and say, ‘This is too complicated, so we’re not going to do it.’ ”