Oren Etzioni

Oren Etzioni (born 1964) is Professor Emeritus of Computer Science at the University of Washington, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). Etzioni is a co-founder of Vercept, an AI startup, and founder and CEO of TrueMedia.org, a non-profit dedicated to fighting political deepfakes, which launched in April 2024. He is also the Founder and Technical Director of the AI2 Incubator and a venture partner at the Madrona Venture Group. == Early life and education == Etzioni is the son of Israeli-American intellectual Amitai Etzioni. He was the first student to major in computer science at Harvard University, where he earned a bachelor's degree in 1986. He earned a PhD from Carnegie Mellon University in January, 1991, supervised by Tom M. Mitchell. == University of Washington career == Etzioni joined the University of Washington faculty in 1991, immediately after receiving his PhD. He rose through the ranks to become the Washington Research Foundation Entrepreneurship Professor in Computer Science & Engineering. Etzioni's research has been focused on basic problems in the study of intelligence, machine reading, machine learning and web search. Past projects include Internet Softbots—the study of intelligent agents in the context of real-world software testbeds. In 2003, he started the KnowItAll project for acquiring massive amounts of information from the web. In 2005, he founded and became the director of the university's Turing Center. The center investigated problems in data mining, natural language processing, the Semantic Web and other web search topics. Etzioni coined the term machine reading and helped to create the first commercial comparison shopping agent. He has published over 200 technical papers, and his H-index exceeds 100. == Entrepreneurship == As a faculty member Etzioni was also an active entrepreneur, founding multiple companies and pioneering multiple technologies including MetaCrawler (bought by Infospace), Netbot (bought by Excite in 1997 for $35 million), and ClearForest (bought by Reuters). He founded Farecast, a travel metasearch and price prediction site, which was acquired by Microsoft in 2008 for $115 million. Before founding Farecast, he developed a program originally called Hamlet, that used algorithms to identify patterns in airfare data using data-mining techniques. He also co-founded Decide.com, a website to help consumers make buying decisions using previous price history and recommendations from other users. Decide.com was bought by eBay in September, 2013. Etzioni is also a venture partner at the Madrona Venture Group. He is founder and CEO of TrueMedia.org, a non-profit dedicated to fighting political deepfakes, which launched in April 2024. Etzioni is a co-founder of Vercept, an AI startup formed in 2025. == Founding CEO of AI2 == In September 2013 Etzioni was selected as the Founding CEO of the Allen Institute for Artificial Intelligence by philanthropist Paul G. Allen, and in January 2014 he took a leave of absence from the University of Washington to serve in that role. Etzioni's technical contributions continued at AI2; for example, in 2015, he helped to create the Semantic Scholar search engine. Under Etzioni’s leadership, AI2 grew from zero to over two hundred team members including notable researchers and engineers across several domains of AI. By 2021, its AI2 researchers had published near 700 papers in publications such as AAAI, ACL, CVPR, NeurIPS, and ICLR. Twenty-four of these papers had garnered special-recognition awards. AI2 also offered several key resources and tools to the AI community including the AllenNLP library, Semantic Scholar, and the conservation platforms EarthRanger and Skylight. Ed Lazowska, AI2 Board Member, has stated about Etzioni that he "took the collegial, collaborative culture that he absorbed in his 20+ years as a professor in UW's Allen School and mixed it with the singular focus that drives startups to create an elixir that AI2 folks have been drinking over the last eight years. The result is an exceptional organization of scientists, engineers, and entrepreneurs that's pursuing Paul Allen’s vision of ‘AI for the Common Good’ with extraordinary success.” == Popular press == In addition to his scientific publications, Etzioni has written commentary on AI for The New York Times, Wired, Nature, and other publications. After reading the idea in a book about AI by Brad Smith and Harry Shum, Etzioni has attempted to create an oath for AI practitioners. In 2018, he published what he called a "Hippocratic Oath for artificial intelligence practitioners" in TechCrunch. == Awards and recognition == In 1993, Etzioni received a National Young Investigator Award. In 2003, Etzioni was elected as AAAI Fellow. In 2005, Etzioni received an IJCAI Distinguished Paper Award for "A Probabilistic Model of Redundancy in Information Extraction". In 2007, he received the Robert S. Engelmore Memorial Award. In 2012 Etzioni was featured as GeekWire's "Geek of the Week". In 2013 Etzioni was voted "Geek of the Year" through GeekWire. In 2022, Etzioni received the 2012 ACL Test-of-Time Paper Award. In 2022, Etzioni, along with Ana-Maria Popescu and Henry Kautz, received the ACM Intelligent User Interfaces Most Impact Award for their 2003 paper, "Towards a Theory of Natural Language Interfaces to Databases". == Personal life == Etzioni has three children, and has said in interviews that family is his number one priority. He is married to Ivone Etzioni, and was previously married to Dr. Ruth Etzioni, a biostatistician at the Fred Hutchinson Cancer Center. Outside of his professional career, Etzioni has a wide range of personal interests. He has attended the Burning Man festival, which he described as a valuable way to step outside his comfort zone. His first computer was a TRS-80, and he has described his car’s GPS as his favorite gadget, joking that he has “no sense of direction.” == Selected publications == === Scholarly publications === Etzioni, Oren (July 1994). "A Softbot-based Interface to the Internet" (PDF). Communications of the ACM. Retrieved March 29, 2018. Etzioni, Oren (December 2008). "Open Information Extraction from the Web" (PDF). Communications of the ACM. Retrieved March 29, 2018. Zamir, Oren; Etzioni, Oren (1998). "Web document clustering". Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval. ACM. pp. 46–54. doi:10.1145/290941.290956. ISBN 978-1-58113-015-7. S2CID 244069. Zamir, Oren; Etzioni, Oren (May 1999). "Grouper: a dynamic clustering interface to Web search results". Computer Networks. 31 (11–16): 1361–1374. CiteSeerX 10.1.1.31.8216. doi:10.1016/S1389-1286(99)00054-7. S2CID 206134308. Popescu, Ana-Maria; Etzioni, Oren (2005). "Extracting product features and opinions from reviews". Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing - HLT '05. pp. 339–346. doi:10.3115/1220575.1220618. Etzioni, Oren; Cafarella, Michael; Downey, Doug; Popescu, Ana-Maria; Shaked, Tal; Sonderland, Stephen; Weld, Daniel; Yates, Alexander (June 2005). "Unsupervised named-entity extraction from the Web: An experimental study". Artificial Intelligence. 165 (1): 91–134. doi:10.1016/j.artint.2005.03.001. Downey, Doug; Etzioni, Oren; Sonderland, Stephen (July 2010). "Grouper: Analysis of a probabilistic model of redundancy in unsupervised information extraction". Artificial Intelligence. 174 (11): 726–748. CiteSeerX 10.1.1.174.2441. doi:10.1016/j.artint.2010.04.024. === Popular articles === Etzioni, Oren (August 4, 2011). "Web Search Needs a Shakeup" (PDF). Nature. Retrieved November 21, 2019. Etzioni, Oren (December 9, 2014). "AI Won't Exterminate Us – It Will Empower Us". Backchannel. Retrieved March 29, 2018. Etzioni, Oren (February 4, 2016). "To Keep AI Safe -- Use AI". Vox. Retrieved November 21, 2019. Etzioni, Oren (April 8, 2016). "Quora Session with Oren Etzioni". Quora. Retrieved March 29, 2018. Etzioni, Oren (June 15, 2016). "Deep Learning Isn't a Dangerous Magic Genie. It's Just Math". Wired. Retrieved March 29, 2018. Etzioni, Oren (September 20, 2016). "No, the Experts Don't Think Superintelligent AI is a Threat to Humanity". MIT Technology Review. Retrieved November 21, 2019. Etzioni, Oren (July 6, 2017). "Artificial intelligence: AI Zooms in on highly influential citations". Nature. Retrieved March 29, 2018. Etzioni, Oren (September 1, 2017). "How to Regulate Artificial Intelligence". The New York Times. Retrieved March 29, 2018. Etzioni, Oren (November 2, 2017). "Workers Displaced by Automation Should Try A New Job: Caregiver". Wired. Retrieved March 29, 2018. Etzioni, Oren (March 14, 2018). "A Hippocratic Oath for artificial intelligence practitioners". Tech Crunch. Retrieved March 29, 2018. Etzioni, Oren (March 7, 2018). "A 'Manhattan Project' for science research". The Hill. Retrieved November 21, 2019. Etzioni, Ore

Cross-language information retrieval

Cross-language information retrieval (CLIR) is a subfield of information retrieval dealing with retrieving information written in a language different from the language of the user's query. The term "cross-language information retrieval" has many synonyms, of which the following are perhaps the most frequent: cross-lingual information retrieval, translingual information retrieval, multilingual information retrieval. The term "multilingual information retrieval" refers more generally both to technology for retrieval of multilingual collections and to technology which has been moved to handle material in one language to another. The term Multilingual Information Retrieval (MLIR) involves the study of systems that accept queries for information in various languages and return objects (text, and other media) of various languages, translated into the user's language. Cross-language information retrieval refers more specifically to the use case where users formulate their information need in one language and the system retrieves relevant documents in another. To do so, most CLIR systems use various translation techniques. CLIR techniques can be classified into different categories based on different translation resources: Dictionary-based CLIR techniques Parallel corpora based CLIR techniques Comparable corpora based CLIR techniques Machine translator based CLIR techniques CLIR systems have improved so much that the most accurate multi-lingual and cross-lingual adhoc information retrieval systems today are nearly as effective as monolingual systems. Other related information access tasks, such as media monitoring, information filtering and routing, sentiment analysis, and information extraction require more sophisticated models and typically more processing and analysis of the information items of interest. Much of that processing needs to be aware of the specifics of the target languages it is deployed in. Mostly, the various mechanisms of variation in human language pose coverage challenges for information retrieval systems: texts in a collection may treat a topic of interest but use terms or expressions which do not match the expression of information need given by the user. This can be true even in a mono-lingual case, but this is especially true in cross-lingual information retrieval, where users may know the target language only to some extent. The benefits of CLIR technology for users with poor to moderate competence in the target language has been found to be greater than for those who are fluent. Specific technologies in place for CLIR services include morphological analysis to handle inflection, decompounding or compound splitting to handle compound terms, and translations mechanisms to translate a query from one language to another. The first workshop on CLIR was held in Zürich during the SIGIR-96 conference. Workshops have been held yearly since 2000 at the meetings of the Cross Language Evaluation Forum (CLEF). Researchers also convene at the annual Text Retrieval Conference (TREC) to discuss their findings regarding different systems and methods of information retrieval, and the conference has served as a point of reference for the CLIR subfield. Early CLIR experiments were conducted at TREC-6, held at the National Institute of Standards and Technology (NIST) on November 19–21, 1997. Google Search had a cross-language search feature that was removed in 2013.

Digital heritage

The Charter on the Preservation of Digital Heritage of UNESCO defines digital heritage as embracing "cultural, educational, scientific and administrative resources, as well as technical, legal, medical and other kinds of information created digitally, or converted into digital form from existing analogue resources". Digital heritage also includes the use of digital media in the service of understanding and preserving cultural or natural heritage. The digitization of both cultural heritage and Natural heritage serves to enable the permanent access of current and future generations to culturally important objects ranging from literature and paintings to flora, fauna, or habitats. It is also used in the preservation and access of objects with enduring or significant historical, scientific, or cultural value including buildings, archeological sites, and natural phenomena. The main idea is the transformation of a material object into a virtual copy. It should not be confused with digital humanities, which uses digitizing technology to specifically help with research. There have been several debates concerning the efficiency of the process of digitizing heritage. Some of the drawbacks refer to the deterioration and technological obsolescence due to the lack of funding for archival materials and underdeveloped policies that would regulate such a process. Another main social debate has taken place around the restricted accessibility due to the digital divide that exists around the world. Nevertheless, new technologies enable easy, instant and cross boarder access to the digitized work. Many of these technologies include spatial and surveying technology to gain aerial or 3D images. Digital heritage is also used to monitor cultural heritage sites over years to help with preservation, maintenance, and sustainable tourism. It aims to observe any changes, diseases, or deterioration that may occur on objects. == Cultural and natural heritage == Digital Heritage that is not born-digital can be divided into two separate groups—digital cultural heritage and digital natural heritage. Digital cultural heritage is the maintenance or preservation of cultural objects through digitization. These are objects, in some cases entire cities, that are considered of cultural importance. These objects are sometimes able to be digitized or physically represented in minute detail. Digital cultural heritage also includes intangible heritage. These are things such as "oral traditions, customs, value systems, skills, traditional dances, diets, performances" and other unique features of a culture. Intangible heritage is particularly vulnerable to destruction due to urbanization. There are several projects and programs which concentrate on digital cultural heritage. One such project is Mapping Gothic France, which aims to document and preserve cathedrals across France using images, VR tours, laser scans, and panoramas. This allows for scientific and historical study and preservation of the cathedrals and also provides detailed access to the sites for anyone in the world. The aim of projects like these is to help with the preservation and restoration of cultural objects. After the fire at Notre-Dame de Paris in 2019, digital scans are a major component in the ongoing restoration. Digital natural heritage pertains to objects of natural heritage that are considered of cultural, scientific, or aesthetic importance. Digital heritage in this instance is used not only to grant access to these objects, but to monitor any changes over time, such as with plant or animal habitats. Geographic information systems are a form of technology that is used primarily in the study of natural heritage. Western Australia has one such digital heritage project where they have created a digital repository of native plants important to both the region and the Aboriginal people. This is in order to protect and preserve the important biological heritage of Western Australia. == Educational impact == The digitization of these heritage objects has impacts around the world and across many disciplines. The increase of digital items means that people, especially the youth, are able to learn about new objects and cultures online through various media. They provide viewers with a more in-depth experience with an item or place, instead of just an image. The media is also able to be curated to age- or educational-level appropriateness, making learning easier. Some of the technology used in education, especially in museums, includes mobile apps, virtual reality, social media, and video games. Cultural heritage institutions are using this technology to try to expand access, increase appreciation for these items, and to gain new viewpoints on their collections. Digital heritage also helps scientists, archeologists, or other historians and specialists collect data on these objects, providing more information on the objects and the past. Digital Heritage is still currently being studied and improved by several sectors invested in cultural and intellectual preservation. It is particularly of interest to museums, governments, and academic institutions. Research by these groups are creating new concepts, methodologies, and techniques for the implementation of digital heritage to protect this type of cultural and natural heritage. As new technologies are created, museums and other heritage institutions are provided with more ways of disseminating their information and engaging with the public. A lack of resources within certain groups may still hinder everyone from accessing digital heritage. == Technologies used == The digitization of cultural heritage is attained through several means. Some of the main technology used is spatial and surveying technology. Space archaeological technology - Observations from space satellites are non-intrusive and can be integrated with other technologies on the ground. It is used to photograph vast areas of earth and help with research. Remnants of ancient civilizations or other human objects are also able to be spotted via satellite imaging. Unmanned aerial vehicles - UAV, such as drones, are commonly used in digitization of cultural heritage objects. The Great Wall of China is one such site that has been digitized and analyzed through unmanned aerial vehicle investigation. The resulting images, 3-D scans, maps, and other data are used to evaluate and maintain the Great Wall. Laser Scanning - Laser scanning is used to scan an area and recreate spatially accurate depictions, such as a 3D model. Virtual and Augmented Reality - VR is used primarily for education but does have uses for reconstruction and research. It is used to provide users with an immersive experience, as though they are actually at the site. Geographic Information systems - GIS are used primarily to study objects and sites over time. It is also important in studying the socioeconomic status of the past. 3D Modeling - 3D modeling has become more widely used due to an increase in technology that works specifically with heritage sites. It is often used in tandem with GIS to reconstruct objects for restoration, documentation, preservation, and educational purposes. Data is collected using satellite or other aerial imaging and ground-based imaging. There is some concern about the accuracy and authenticity of these types of digital reconstructions and their effects on the sites themselves. A major barrier to digital heritage is the amount of resources it takes to undertake such projects, such as money, time, and technology. Money and the lack of qualified personnel are two that are considered the most obstructive. This is especially an issue in less developed areas or within underfunded groups such as minorities. == Virtual heritage == A particular branch of digital heritage, known as "virtual heritage", is formed by the use of information technology with the aim of recreating the experience of existing cultural heritage, as in (approximations of) virtual reality. It is hard to differentiate this branch from the core contribution of digital heritage which is storing the heritage data digitally. Parsinejad et al. developed two techniques for Digital Twinning of the architectural assets and representation of the physical assets virtually in the museum context. Two techniques are hand recording and digital recording and both have challenges in adoption and implementation of Digital Twin as a revolutionary concept. == Digital heritage stewardship == Digital heritage stewardship is a form of digital curation which is modeled after collaborative curation. Digital heritage stewardship means stepping away from typical curatorial practices (e.g. discovering, arranging, and sharing information, material, and/or content) in favor of practices which allow its stakeholders the opportunity to contribute historical, political, and social context and culture. The collaborative practice encourages the creation, engagement, and maintena

Telecommunications device for the deaf

A telecommunications device for the deaf (TDD) is a teleprinter, an electronic device for text communication over a telephone line, that is designed for use by persons with hearing or speech difficulties. Other names for the device include teletypewriter (TTY), textphone (common in Europe), and minicom (United Kingdom). The typical TDD is a device about the size of a typewriter or laptop computer with a QWERTY keyboard and small screen that uses an LED, LCD, or VFD screen to display typed text electronically. In addition, TDDs commonly have a small spool of paper on which text is also printed – old versions of the device had only a printer and no screen. The text is transmitted live, via a telephone line, to a compatible device, i.e. one that uses a similar communication protocol. Special telephone services have been developed to carry the TDD functionality even further. In certain countries, there are systems in place so that a deaf person can communicate with a hearing person on an ordinary voice phone using a human relay operator. There are also "carry-over" services, enabling people who can hear but cannot speak ("hearing carry-over", a.k.a. "HCO"), or people who cannot hear but are able to speak ("voice carry-over", a.k.a. "VCO") to use the telephone. The term TDD is sometimes discouraged because people who are deaf are increasingly using mainstream devices and technologies to carry out most of their communication. The devices described here were developed for use on the partially-analog Public Switched Telephone Network (PSTN). They do not work well on the new internet protocol (IP) networks. Thus as society increasingly moves toward IP based telecommunication, the telecommunication devices used by people who are deaf will not be TDDs. In the US and Canada, the devices are referred to as TTYs. Teletype Corporation, of Skokie, Illinois, made page printers for text, notably for news wire services and telegrams, but these used standards different from those for deaf communication, and although in quite widespread use, were technically incompatible. Furthermore, these were sometimes referred to by the "TTY" initialism, short for "Teletype". When computers had keyboard input mechanisms and page printer output, before CRT terminals came into use, Teletypes were the most widely used devices. They were called "console typewriters". (Telex used similar equipment, but was a separate international communication network.) == History == === APCOM acoustic coupler or MODEM device === The TDD concept was developed by James C. Marsters (1924–2009), a dentist and private airplane pilot who became deaf as an infant because of scarlet fever, and Robert Weitbrecht, a deaf physicist. In 1964, Marsters, Weitbrecht and Andrew Saks, an electrical engineer and grandson of the founder of the Saks Fifth Avenue department store chain, founded APCOM (Applied Communications Corp.), located in the San Francisco Bay area, to develop the acoustic coupler, or modem; their first product was named the PhoneType. APCOM collected old teleprinter machines (TTYs) from the Department of Defense and junkyards. Acoustic couplers were cabled to TTYs enabling the AT&T standard Model 500 telephone to couple, or fit, into the rubber cups on the coupler, thus allowing the device to transmit and receive a unique sequence of tones generated by the different corresponding TTY keys. The entire configuration of teleprinter machine, acoustic coupler, and telephone set became known as the TTY. Weitbrecht invented the acoustic coupler modem in 1964. The actual mechanism for TTY communications was accomplished electro-mechanically through frequency-shift keying (FSK) allowing only half-duplex communication, where only one person at a time can transmit. === Paul Taylor TTY device === During the late 1960s, Paul Taylor combined Western Union Teletype machines with modems to create teletypewriters, known as TTYs. He distributed these early, non-portable devices to the homes of many in the deaf community in St. Louis, Missouri. He worked with others to establish a local telephone wake-up service. In the early 1970s, these small successes in St. Louis evolved into the nation's first local telephone relay system for the deaf. === Micon Industries MCM device === In 1973, the Manual Communications Module (MCM), which was the world's first electronic portable TTY allowing two-way telecommunications, premiered at the California Association of the Deaf convention in Sacramento, California. The battery-powered MCM was invented and designed by a deaf news anchor and interpreter, Kit Patrick Corson, in conjunction with Michael Cannon and physicist Art Ogawa. It was manufactured by Michael Cannon's company, Micon Industries, and initially marketed by Kit Corson's company, Silent Communications. In order to be compatible with the existing TTY network, the MCM was designed around the five-bit Baudot code established by the older TTY machines instead of the ASCII code used by computers. The MCM was an instant success with the deaf community despite the drawback of a $599 cost. Within six months there were more MCMs in use by the deaf and hard of hearing than TTY machines. After a year Micon took over the marketing of the MCM and subsequently concluded a deal with Pacific Bell (who coined the term "TDD") to purchase MCMs and rent them to deaf telephone subscribers for $30 per month. After Micon formed an alliance with APCOM, Michael Cannon (Micon), Paul Conover (Micon), and Andrea Saks (APCOM) successfully petitioned the California Public Utilities Commission (CPUC), resulting in a tariff that paid for TTY devices to be distributed free of cost to deaf persons. Micon produced over 1,000 MCMs per month, resulting in approximately 50,000 MCMs being disseminated into the deaf community. Before he left Micon in 1980, Michael Cannon developed several computer compatible variations of the MCM and a portable, battery operated printing TTY, but they were never as popular as the original MCM. Newer model TTYs could communicate with selectable codes that allow communications at a higher bit rate on those models similarly equipped. However, the lack of true computer interface functionality spelled the demise of the original TTY and its clones. During the mid-1970s, other so-called portable telephone devices were being cloned by other companies, and this was the time period when the term "TDD" began being used largely by those outside the deaf community. === Text messaging and the Def-Tone System (DTS) === This relay system became known commonly as the Def-Tone System (DTS) because the tones representing letters of the alphabet were eventually carried in tones outside the range of human hearing. Today, this is commonly called multi-tap because you press a number 1, 2 or 3 times to get a corresponding letter. In 1994 Joseph Alan Poirier, a college student-worker, recommended using the system to send texts to forklifts to improve delivery of parts to the assembly line at GM Powertrain in Toledo, Ohio, and sending a text to pagers. He recommended taking pagers to alphanumeric displays incorporating the same system in discussions with the pager supplier for Outback Steakhouse and having relays put in the forklifts to ping alert messages to the pagers used in that system. He called it text messaging, coining the phrase. It is theorized that when Toyota forklift was allegedly hired by GM for this work, one of the subcontractors, Kyocera, utilized the work for the Toyota forklift company to create text messaging for cell phones. === Marsters Award === In 2009, AT&T received the James C. Marsters Promotion Award from TDI (formerly Telecommunications for the Deaf, Inc.) for its efforts to increase accessibility to communication for people with disabilities. The award holds some irony; it was AT&T that, in the 1960s, resisted efforts to implement TTY technology, claiming it would damage its communication equipment. In 1968, the Federal Communications Commission struck down AT&T's policy and forced it to offer TTY access to its network. == Protocols == There are many different standards for TDDs and textphones. === Original 5-bit Baudot code === The original standard used by TTYs is a variant of the Baudot code. The maximum speed of this protocol is 10 characters per second. This is a half-duplex protocol, which means that only one person at a time may transmit characters. If both try to transmit at the same time, the characters will be garbled on the other end. This protocol is commonly used in the United States. This is a variant of the Baudot code, implemented as 5-bits per character transmitted asynchronously using frequency-shift key-modulation at either 45.5 or 50 baud, 1 start bit, 5 data bits, and 1.5 stop bits. Details of the protocol implementation are available in TIA-825-A and also in T-REC V.18 Annex A "5-bit operational mode". === Turbo Code === The UltraTec company implements another protocol known as Enh

Texas House Bill 20

An Act Relating to censorship of or certain other interference with digital expression, including expression on social media platforms or through electronic mail messages, also known as Texas House Bill 20 (HB20), is a Texas anti-deplatforming law enacted on September 9, 2021. It prohibits large social media platforms from removing, moderating, or labeling posts made by users in the state of Texas based on their "viewpoints", unless considered illegal under federal law or otherwise falling into exempted categories. It also requires them to make various public disclosures relating to their business practices (including the impact of algorithmic and moderation decisions on the content that is delivered to users). The bill is part of a wider array of Republican-backed legislation seeking to prohibit the censorship of political speech, based on allegations that the moderation policies of large social media platforms are not politically neutral. It has been challenged in NetChoice, LLC v. Paxton, and is currently the subject of a circuit split between the Fifth Circuit, and a decision by the Eleventh Circuit that struck down a similar bill in the state of Florida. In September 2023, the U.S. Supreme Court agreed to hear NetChoice v. Paxton jointly with NetChoice v. Moody on questions of whether the Florida and Texas state laws are in compliance with the 1st Amendment. == Content == The law applies to "social media platforms" that serve users in the state of Texas, and have more than 50 million monthly active users in the United States. They are defined as any public internet website or application that allows users to "communicate with other users for the primary purpose of posting information, comments, messages, or images", excluding internet service providers, electronic mail, and services where communication features are "incidental to, directly related to, or dependent on" content that is pre-selected by the operator. In the bill, to "censor" is defined as to "block, ban, remove, deplatform, demonetize, de-boost, restrict, deny equal access or visibility to, or otherwise discriminate against" expression. The law prohibits social media platforms from "censoring on the basis of user viewpoint, user expression, or the ability of a user to receive the expression of others", or on the basis of a user's geographic location in Texas. This includes removal or labeling posts with warnings and disclaimers. Social media platforms may only censor content if it is unlawful, they are "specifically authorized" to do so by federal law, based on requests from "an organization with the purpose of preventing the sexual exploitation of children or protecting survivors of sexual abuse from ongoing harassment", or "directly incites" criminal activity or contains threats of violence against persons based on protected categories. It is disputed over whether this provision is actually enforceable, as it may be preempted by Section 230 of the Communications Decency Act (which states that the operators of interactive computer services are not responsible for the actions of their users). Social media platforms must make public disclosures regarding the algorithmic techniques and moderation polices that are used to determine the content provided to users, must publish a compliant acceptable use policy (AUP), and must publish a biannual transparency report containing specific details on all actions made by the service regarding the moderation of users and content. The law also prohibits email providers from "intentionally imped[ing] the transmission of another person's electronic mail message based on the content." == Legislative history == Texas Governor Greg Abbott signed the bill into law on September 9, 2021. Democrat-proposed amendments excluding Holocaust denial, terrorism content, and vaccine misinformation from the bill were rejected. Following a suit by the industry groups Computer & Communications Industry Association (CCIA) and NetChoice, NetChoice, LLC v. Paxton, the bill was blocked by U.S. District Judge Robert Pitman in December 2021, on First Amendment grounds. Texas appealed to the United States Court of Appeals for the Fifth Circuit. Judges Edith Jones, Andrew Oldham, and Leslie H. Southwick, lifted the injunction on May 11, 2022, but the decision was appealed to the Supreme Court which suspended the bill pending a full review in the Fifth Circuit. On September 16, 2022, the Fifth Circuit reversed the injunction, allowing the bill to take effect; Judge Oldham stated that the bill "chills censorship" and "does not chill speech", and accused the plaintiffs of "attempt[ing] to extract a freewheeling censorship right from the Constitution's free speech guarantee. The Platforms are not newspapers. Their censorship is not speech." Southwick dissented, stating that "we are in a new arena, a very extensive one, for speakers and for those who would moderate their speech. None of the precedents fit seamlessly." The CCIA and NetChoice requested a stay on the ruling and that the case be taken to the Supreme Court, arguing that the reversal conflicts with an Eleventh Circuit decision in NetChoice v. Moody which struck down a similar anti-moderation bill imposed by the state of Florida. On October 12, 2022, the Fifth Circuit granted the stay.

Operation Serenata de Amor

Operation Serenata de Amor is an artificial intelligence project designed to analyze public spending in Brazil. The project has been funded by a recurrent financing campaign since September 7, 2016, and came in the wake of major scandals of misappropriation of public funds in Brazil, such as the Mensalão scandal and what was revealed in the Operation Car Wash investigations. The analysis began with data from the National Congress then expanded to other types of budget and instances of government, such as the Federal Senate. The project is built through collaboration on GitHub and using a public group with more than 600 participants on Telegram. The name "Serenata de Amor," which means "serenade of love," was taken from a popular cashew cream bonbon produced by Chocolates Garoto in Brazil. == Modules == Throughout development of the project, new modules have been newly introduced in addition to the main repository: The main repository, serenata-de-amor, serves as the starting point for investigative work. Rosie is the robot programmed to identify public funds expenses with discrepancies, starting with CEAP (Quota for Exercise of Parliamentary Activity); it analyzes each of the reimbursements requested by the deputies and senators, indicating the reasons that lead it to believe they are suspicious. From Rosie was born whistleblower, which tweets under the name of @RosieDaSerenata, distributing the results found on social media. Jarbas (Github repository) is a data visualization tool which shows a complete list of reimbursements made available by the Chamber of Deputies and mined by Rosie. Toolbox is a Python installable package that supports the development of Serenata de Amor and Rosie. == History == Operation Serenata de Amor is an Artificial intelligence project for analysis of public expenditures. It was conceived in March 2016 by data scientist Irio Musskopf, sociologist Eduardo Cuducos and entrepreneur Felipe Cabral. The project was financed collectively in the Catarse platform, where it reached 131% of the collection goal paying 3 months of project development. Ana Schwendler, also a data scientist, Pedro Vilanova "Tonny", data journalist, Bruno Pazzim, software engineer, Filipe Linhares, a frontend engineer, Leandro Devegili, an entrepreneur and André Pinho took the first steps towards constructing the platform, such as collecting and structuring the first datasets. Jessica Temporal, data scientist and Yasodara Córdova "Yaso", researcher, Tatiana Balachova "Russa", UX designer, joined the project after the financing took place. The members created a recurring financing campaign, expanding the analysis of public spending to the Federal Senate. Donors make monthly payments ranging from 5 BRL to 200 BRL to maintain group activities. The monthly amount collected is around 10,000 BRL. == Results == In January 2017, concluding the period financed by the initial campaign, the group carried out an investigation into the suspicious activities found by the data analysis system. 629 complaints were made to the Ombudsman's Office of the Chamber of Deputies, questioning expenses of 216 federal deputies. In addition, the Facebook project page has more than 25,000 followers, and users frequently cite the operation as a benchmark in transparency in the Brazilian government. One of the examples of results obtained by the operation is the case of the Deputy who had to return about 700 BRL to the House after his expenses were analyzed by the platform. The platform was able to analyze more than 3 million notes, raising about 8,000 suspected cases in public spending. The community that supports the work of the team benefits from open source repositories, with licenses open for the collaboration. So much so that the two main data scientists of the project presented it at the CivicTechFest in Taipei, obtaining several mentions even in the international press. The technical leader presented the project in Poland during DevConf2017 in Kraków. It was also presented in the Google News Lab in 2017. It was presented by Yaso, when she was the Director of the initiative, at the MIT Media Lab/Berkman Klein Center Initiative for Artificial Intelligence ethics, and at the Artificial Intelligence and Inclusion Symposium, an initiative of the Global Network of Internet & Society Centers (NoC). It was also presented both by Irio and Yaso at the Digital Harvard Kennedy School, over a lunch seminar, where the transparency of the platform and the main solutions found were discussed, so that the code and data are always available to verify its suitability. This infographic provides information about the first results of Operation Serenata de Amor, a project that analyzes open data on public spending to find discrepancies. The project was presented by Yaso to the House Audit and Control Committee of the Chamber of Deputies in August 2017, and raised the interest of House officials who work with open data. The operation has been a source of inspiration for other civic projects that aim to work with similar goals, demonstrating the broader impact of artificial intelligence also in industry in Brazil. Participation of several team members in events throughout Brazil and abroad can be found on the Internet, such as presentation at OpenDataDay, held at Calango Hackerspace in the Federal District, Campus Party Bahia, Campus Party Brasilia, Friends of Tomorrow, XIII National Meeting of Internal Control, in the event USP Talks Hackfest against corruption in João Pessoa, the latter being also highlighted in the National Press.

MADI

Multichannel Audio Digital Interface (MADI) standardized as AES10 by the Audio Engineering Society (AES) defines the data format and electrical characteristics of an interface that carries multiple channels of digital audio. The AES first documented the MADI standard in AES10-1991 and updated it in AES10-2003 and AES10-2008. The MADI standard includes a bit-level description and has features in common with the two-channel AES3 interface. MADI supports serial digital transmission over coaxial cable or fibre-optic lines of 28, 56, 32, or 64 channels; and sampling rates to 96 kHz and beyond with an audio bit depth of up to 24 bits per channel. Like AES3 and ADAT Lightpipe, it is a unidirectional interface from one sender to one receiver. == Development and applications == MADI was developed by AMS Neve, Solid State Logic, Sony and Mitsubishi and is widely used in the audio industry, especially in the professional audio sector. It provides advantages over other audio digital interface protocols and standards such as AES3, ADAT Lightpipe, TDIF (Tascam Digital Interface), and S/PDIF (Sony/Philips Digital Interface). These advantages include: Support for a greater number of channels per line Use of coaxial and optical fiber media that support transmission of audio signals over 100 meters, up to 3000 meters over multi-mode and 40,000 meters over single-mode optical fiber The original specification (AES10-1991) defined the MADI link as a 56-channel transport for linking large-format mixing consoles to digital multitrack recording devices. Large broadcast studios also adopted it for routing multi-channel audio throughout their facilities. The 2003 revision (AES10-2003) adds a 64-channel capability by removing varispeed operation and supports 96 kHz sampling frequency with reduced channel capacity. The latest AES10-2008 standard includes minor clarifications and updates to correspond to the current AES3 standard. Audio over Ethernet of various types is the primary alternative to MADI for transport of many channels of professional digital audio. == Transmission format == MADI links use a transmission format similar to Fiber Distributed Data Interface (FDDI) networking. Since MADI is most often transmitted on copper links via 75-ohm coaxial cables, it more closely compares to the FDDI specification for copper-based links, called CDDI. AES10-2003 recommends using BNC connectors with coaxial cables and SC connectors with optic fibers. MADI over fibre can support a range of up to 2 km. The basic data rate is 100 Mbit/s of data using 4B5B encoding to produce a 125 MHz physical baud rate. Unlike AES3, this clock is not synchronized to the audio sample rate, and the audio data payload is padded using JK sync symbols. Sync symbols may be inserted at any subframe boundary, and must occur at least once per frame. Though the standard disassociates the transmission clock from the audio sample rate, and thus requires a separate word clock connection to maintain synchronization, some vendors do give the option of locking to parts of the transmission timing information for purposes of deriving a word clock. The audio data is almost identical to the AES3 payload, though with more channels. Rather than letters, MADI assigns channel numbers from 0–63. Frame synchronization is provided by sync symbols outside the data itself, rather than an embedded preamble sequence, and the first four time slots of each sub-channel are encoded as normal data, used for sub-channel identification: Bit 0: Set to 1 to mark channel 0, the first channel in each frame Bit 1: Set to 1 to indicate that this channel is active (contains interesting data) Bit 2: notA/B channel marker, used to mark left (0) and right (1) channels. Generally, even channels are A and odd channels are B. Bit 3: Set to 1 to mark the beginning of a 192-sample data block == Sampling frequency == The original AES10-1991 specification allowed 56 channels at sample rates from 32 to 48 kHz with an additional vari-speed range of ± 12.5%. This leads to a total range of 28 to 54 kHz. At the highest frequency, this produces a total of 56 × 32 × 54 = 96768 kbit/s, leaving 3.232% of the channel for synchronization marks and transmit clock error. The 2003 revision specifies different relations between sampling frequency and number of channels. 32 kHz to 48 kHz ± 12.5%, 56 channels; 32 kHz to 48 kHz nominal, 64 channels; 64 kHz to 96 kHz ± 12.5%, 28 channels. With a 48 kHz sampling frequency, 64 channels take 64 × 32 × 48000 = 98.304 Mbit/s. Adding the minimum 8 × 58 kbit/s of framing produces 98688 bit/s, leaving 1.312% free for timing variation and other overhead. Both versions of the standard accommodate higher sampling frequencies (for example, 96 kHz or 192 kHz) by using two or more channels per audio sample on the link.