BulSemCor

BulSemCor

The Bulgarian Sense-annotated Corpus (BulSemCor) (Bulgarian: Български семантично анотиран корпус (БулСемКор)) is a structured corpus of Bulgarian texts in which each lexical item is assigned a sense tag. BulSemCor was created by the Department of Computational Linguistics at the Institute for Bulgarian Language of the Bulgarian Academy of Sciences. == Structure == BulSemCor was created as part of a nationally funded project titled "BulNet – A lexico-semantic network for the Bulgarian Language" (2005–2010). It follows the general methodology of SemCor combined with some specific principles. The corpus for annotation consists of 101,791 tokens covering an excerpt from the Bulgarian "Brown" Corpus modelled on the Brown Corpus.Francis Kucera An important feature of BulSemCor is that the samples are selected using heuristics that provide optimal coverage of ambiguous lexis. BulSemCor is manually sense-annotated according to the Bulgarian WordNet. Its size is comparable to that of other contemporary semantically annotated corpora or pool of acceptable linguistic components. The semantic annotation consists in associating each lexical item in the corpus with exactly one synonym set (synset) in the Bulgarian WordNet that best describes its sense in the particular context. The selection of the best match among the suggested candidates is based on a set of procedures, such as the other synset members, the synset gloss (explanatory definition) and the position of a given candidate in the WordNet structure. == Scale == The number of annotated tokens is 99,480 (the difference in the number of tokens compared to the initial corpus is due to the fact that some of them are not linguistic items). The simple word count is 86,842 and multiword expressions (MWE) are 5,797 (12,638 tokens). == Specific features == All words in BulSemCor are assigned a sense, while according to established practice only simple content words or content word classes (typically nouns and verbs) are annotated. Since 2000 the development of language resources, has broadened to include annotation of function words and multiword expressions covering particular senses or types of words and expressions. In this respect, BulSemCor's annotation is more exhaustive and hence provides greater opportunities for linguistic observations and non-linear programming (NLP) applications. Annotated items inherit the linguistic information associated with the corresponding synset, which along with morphological and semantic tags may include annotation on one or more of the following additional levels: Partial information about the syntactic structure of MWE types – particularly, information about syntactic heads and their dependents; Information about the category of the named entities – names, locations, organisations, dates, numbers, etc.; Information about the taxonomic category of adverbs, such as time, place, manner, degree, quantity, etc.; Information about the type of the syntactic relationships – coordination or subordination – expressed by conjunctions; Information about the original part-of-speech of substantivised words (non-nouns that act as nouns in a particular context); Stylistic/register, grammatical and other information about synsets or individual synset members;

Coherent extrapolated volition

Coherent extrapolated volition (CEV) is a theoretical framework in the field of AI alignment describing an approach by which an artificial superintelligence (ASI) would act on a benevolent supposition of what humans would want if they were more knowledgeable, more rational, had more time to think, and had matured together as a society, as opposed to humanity's current individual or collective preferences. It was proposed by Eliezer Yudkowsky in 2004 as part of his work on friendly AI. == Concept == CEV proposes that an advanced AI system should derive its goals by extrapolating the idealized volition of humanity. This means aggregating and projecting human preferences into a coherent utility function that reflects what people would desire under ideal epistemic and moral conditions. The aim is to ensure that AI systems are aligned with humanity's true interests, rather than with transient or poorly informed preferences. In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted. == Debate == Yudkowsky and Nick Bostrom note that CEV has several interesting properties. It is designed to be humane and self-correcting, by capturing the source of human values instead of trying to list them. It avoids the difficulty of laying down an explicit, fixed list of rules. It encapsulates moral growth, preventing flawed current moral beliefs from getting locked in. It limits the influence that a small group of programmers can have on what the ASI would value, thus also reducing the incentives to build ASI first. And it keeps humanity in charge of its destiny. CEV also faces significant theoretical and practical challenges. Bostrom notes that CEV has "a number of free parameters that could be specified in various ways, yielding different versions of the proposal." One such parameter is the extrapolation base (whose extrapolated volition is taken into account). For example, whether it should include people with severe dementia, patients in a vegetative state, foetuses, or embryos. He also notes that if CEV's extrapolation base only includes humans, there is a risk that the result would be ungenerous toward other animals and digital minds. One possible solution would be to include a mechanism to expand CEV's extrapolation base. == Variants and alternatives == A proposed theoretical alternative to CEV is to rely on an artificial superintelligence's superior cognitive capabilities to figure out what is morally right, and let it act accordingly. It is also possible to combine both techniques, for instance with the ASI following CEV except when it is morally impermissible. In another review, a philosophical analysis explores CEV through the lens of social trust in autonomous systems. Drawing on Anthony Giddens' concept of "active trust", the author proposes an evolution of CEV into "Coherent, Extrapolated and Clustered Volition" (CECV). This formulation aims to better reflect the moral preferences of diverse cultural groups, thus offering a more pragmatic ethical framework for designing AI systems that earn public trust while accommodating societal diversity.

METEO System

The METEO System is a machine translation system specifically designed for the translation of the weather forecasts issued daily by Environment Canada. The system was used from 1981 to 30 September 2001 by Environment Canada to translate forecasts issued in French in the province of Quebec into English and those issued in English in other Canadian provinces into French. Since then, a competitor program has replaced METEO System after an open governmental bid. The system was developed by John Chandioux and was often mentioned as one of the few success stories in the field of machine translation. == History == The METEO System was in operational use at Environment Canada from 1982 to 2001. It stems from a prototype developed in 1975–76 by the TAUM Group, known as TAUM-METEO. The initial motivation to develop that prototype was that a junior translator came to TAUM to ask for help in translating weather bulletins at Environment Canada. Since all official communications emanating from the Canadian government must be available in French and English, because of the Official Languages Act of 1969, and weather bulletins represent a large amount of translation in real time, junior translators had to spend several months producing first draft translations, which were then revised by seniors. That was a difficult and tedious job, because of the specificities of the English and French sublanguages used, and not very rewarding, as the lifetime of a bulletin is only 4 hours. TAUM proposed to build a prototype MT system, and Environment Canada agreed to fund the project. A prototype was ready after a few months, with basic integration in the workflow of translation (source and target bulletins travelled over telex lines at the time and MT happened on a mainframe computer). The first version of the system (METEO 1) went into operation on a Control Data CDC 7600 supercomputer in March 1977. Chandioux then left the TAUM group to manage its operation and improve it, while the TAUM group embarked on a different project (TAUM-aviation, 1977–81). Benoit Thouin made improvements to the initial prototype over the subsequent year, and turned it into an operational system. After three years, METEO 1 had demonstrated the feasibility of microcomputer-based machine translation to the satisfaction of the Canadian government's Translation Bureau of Public Works and Government Services Canada. METEO 1 was formally adopted in 1981, replacing the junior translators in the workflow. Because of the need for high-quality translation, the revision step, done by senior translators, was maintained. The quality, measured as the percentage of edit operations (inserting or deleting a word counts as 1, replacing as 2) on the MT results, reached 85% in 1985. Until that time, the MT part was still implemented as a sequence of Q-systems. The Q-systems formalism is a rule-based SLLP (Specialized Language for Linguistic Programming) invented by Alain Colmerauer in 1967 as he was a postdoc coopérant at the TAUM group. He later invented the Prolog language in 1972 after returning to France and becoming a university professor in Marseille-Luminy. As the engine of the Q-systems is highly non-deterministic, and the manipulated data structures are in some ways too simple, without any types such as string or number, Chandioux encountered limitations in his efforts to raise translation quality and lower computation time to the point he could run it on microcomputers. In 1981, Chandioux created a new SLLP, or metalanguage for linguistic applications, based on the same basic algorithmic ideas as the Q-systems, but more deterministic, and offering typed labels on tree nodes. Following the advice of Bernard Vauquois and Colmerauer, he created GramR, and developed it for microcomputers. In 1982, he could start developing in GramR a new system for translating the weather bulletins on a high-end Cromemco microcomputer. METEO 2 went into operation in 1983. The software then ran in 48Kb of central memory with a 5Mb hard disk for paging. METEO 2 was the first MT application to run on a microcomputer. In 1985, the system had nothing left of the initial prototype, and was officially renamed METEO. It translated about 20 million words per year from English into French, and 10 million words from French into English, with a quality of 97%. Typically, it took 4 minutes for a bulletin in English to be sent from Winnipeg and come back in French after MT and human revision. In 1996, Chandioux developed a special version of his system (METEO 96) which was used to translate the weather forecasts (different kinds of bulletins) issued by the US National Weather Service during the 1996 Summer Olympics in Atlanta. The last known version of the system, METEO 5, dates from 1997 and ran on an IBM PC network under Windows NT. It translated 10 pages per second, but was able to fit into a 1.44Mb floppy disk.

The Best Free AI Customer-support Bot for Beginners

Shopping for the best AI customer-support bot? An AI customer-support bot is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI customer-support bot slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

Scott Fahlman

Scott Elliott Fahlman (born March 21, 1948) is an American computer scientist and Professor Emeritus at Carnegie Mellon University's Language Technologies Institute and Computer Science Department. He is notable for early work on automated planning and scheduling in a blocks world, on semantic networks, on neural networks (especially the cascade correlation algorithm), on the programming languages Dylan, and Common Lisp (especially CMU Common Lisp), and he was one of the founders of Lucid Inc. During the period when it was standardized, he was recognized as "the leader of Common Lisp." From 2006 to 2015, Fahlman was engaged in developing a knowledge base named Scone, based in part on his thesis work on the NETL Semantic Network. He also is credited with coining the use of the emoticon. == Life and career == Fahlman was born in Medina, Ohio, the son of Lorna May (Dean) and John Emil Fahlman. He attended the Massachusetts Institute of Technology (MIT), where he received a Bachelor of Science (B.S.) and Master of Science (M.S.) degree in electrical engineering and computer science in 1973, and a Doctor of Philosophy (Ph.D.) in artificial intelligence in 1977. He has noted that his doctoral diploma says the degree was awarded for "original research as demonstrated by a thesis in the field of Artificial Intelligence" and suggested that it may be the first doctorate to use that term. He is a fellow of the American Association for Artificial Intelligence. Fahlman acted as thesis advisor for Donald Cohen, David B. McDonald, David S. Touretzky, Skef Wholey, Justin Boyan, Michael Witbrock, and Alicia Tribble Sagae. From May 1996 to July 2001, Fahlman directed the Justsystem Pittsburgh Research Center. === Boltzmann Machine (1983) === In 1983, Fahlman, Geoffrey Hinton, and Terry Sejnowski published a paper in Proceedings of the AAAI-83 Conference, Washington DC, August 1983. The paper was titled as "Massively Parallel Architectures for AI: NETL, Thistle and Boltzmann Machines". === Emoticons === Fahlman was not the first to suggest the concept of the emoticon – a similar concept for a marker appeared in an article of Reader's Digest in May 1967, although that idea was never put into practice. In an interview printed in The New York Times in 1969, Vladimir Nabokov noted: "I often think there should exist a special typographical sign for a smile – some sort of concave mark, a supine round bracket." Fahlman is credited with originating the first smiley emoticon, which he thought would help people on a message board at Carnegie Mellon to distinguish serious posts from jokes. He proposed the use of :-) and :-( for this purpose, and the symbols caught on. The original message from which these symbols originated was posted on 19 September 1982. The message was recovered by Jeff Baird on 10 September 2002 and read: 19-Sep-82 11:44 Scott E Fahlman :-) From: Scott E Fahlman I propose that the following character sequence for joke markers: :-) Read it sideways. Actually, it is probably more economical to mark things that are NOT jokes, given current trends. For this, use :-(

Cybersecurity in space

Cybersecurity in space involves the defense of all space assets (e.g. navigation systems, satellites, ground antennas, networks, etc.). The security of space can be affected by attacks such as disruption, corruption as well as the destruction of depended-upon assets/collected data. Government (e.g. militaries) and non-government sectors (e.g. financial industries) have started to become more reliant on numerous space-based services. Due to the criticality of these services, space security experts have identified these assets as high-value targets (HVT) that can cause detrimental consequences to all of Earth. == Scope and definitions == Space assets are broken down by three sub-sectors: the space component, the ground component, and the individual user component. The architecture of space assets is extremely complex and allows for a frequent attack vector utilized, the disruption by radio frequency (RF) cyber-attacks. In 2020, a memorandum was published by President Donald Trump, Space Policy Directive‑5 (SPD‑5). It established principles to ensure the safeguarding of all space assets. In 2023, the National Institute of Standards and Technology’s (NIST) published IR 8270, Introduction to Cybersecurity for Commercial Satellite Operations. This report established a baseline risk-management framework (RMF) to be implemented into space operations. == History == During the Cold War in the 1950s-1960s, the United States and Russia entered what was called the “Space Race”. By 1957, the Soviet Union successfully launched the first satellite into space named Sputnik. By 1961, the first key milestone was accomplished when the Soviet Union’s Yuri Gagarin became the first human to orbit Earth. This was later followed by the first American, Alan Shepard, to be launched into space; this was followed by John Glenn becoming the first American to orbit Earth in 1962. In 1969, a pinnacle milestone was reached when Apollo 11 launched into space and Neil Armstrong became the first man to walk on the moon. As space operations furthered, Commercial off-the-shelf products became increasingly popular but resulted in a rapid increase to the cyber-attack surface. Public awareness of space security did not increase until 2022, when the Viasat KA-SAT incident occurred, resulting in the disruption of a large number of modems across Europe. The attack was later accredited to Russia by the U.S. and the U.K. Policy and standards started to rapidly increase by 2020. The establishment of SPD-5 was released in 2020 followed by asset hardening instructions in 2022, and NIST’s IR 8270 in 2023. It was not until 2025 that Europe published their own findings in the Space Threat Landscape 2025 Report. This document led to the EU’s security proposals and standards. == Threats == === Radio-frequency Interference and Global Navigation Satellite Systems (GNSS) Spoofing === Space services are highly dependent on RF links for systems such as GNSS, however, a consequence of this dependency on RF is denial of service and deception. In 2017, the Black Sea maritime event occurred when numerous ships were subject to spoofing. Space services depend on RF links susceptible to jamming (denial) and spoofing (deception), including for GNSS/Positioning, Navigation, and Timing (PNT). Annotated incidents include the 2017 Black Sea maritime spoofing event affecting numerous ships, and extensive aviation GNSS spoofing patterns surveyed in various regions during 2024–2025. === Network intrusion and malware === Cyber threats can intrude and infect assets with malware. They do this by finding misconfiguration vulnerabilities, remote-management interfaces, and/or supply-chain vulnerabilities mainly in ground networks and user terminals. When KA-SAT occurred, it resulted from bulk modem disturbances. Forensic analysts later suggested malicious management controls and wiper malware as the root cause. === Supply-chain and lifecycle risks === The outsource of COTS components, external vendors, and software defined payloads allowed for vulnerabilities to emerge in the System/Product Lifecycle. In response, EU recommended the implementation of lifecycle-wide controls as mitigating factors. === Espionage, disruption, and influence === As Advanced Persistent Threats (APTs), Global Positioning System (GPS) intervention, and information warfare increased, assets like transponders became more frequent targets of attack. == Noteworthy incidents == The Viasat KA‑SAT incident of 2022, where a large number of modems in Europe were disrupted, resulted in the loss of telemetry access to a significant amount of wind turbines in Germany. The mass GNSS deception of the Black Sea in 2017 affected numerous ships when they started to convey fake central locations in Russia. Between 2024 and 2025, there was a mass, repetitive aviation GNSS spoofing that affected the aircraft of various regions. == Standards, guidelines, and best practices == SPD‑5 (U.S.) – This established risk-based engineering, verifying and ensuring positive control, and the implementation of risk mitigation controls. NIST IR 8270 – This created a RMF for COTS satellites. CISA/FBI SATCOM Advisory (AA22‑076) – Provided guidance on hardening techniques such as least-privileged, access control, encryption, etc.). ENISA Space Threat Landscape 2025 – It established the categorization of assets to organize threats, ensuring the observation of system/product lifecycle, and an RMF for COTS satellites. ECSS‑E‑ST‑80C (2024) – This established a standard for securing lifecycles in space, covering all segments (e.g. ground, launch, etc.). == Regulation and governance == As of 2025, there is no international regulations established for space assets, but the U.S., EU, and ESA institutional initiatives have published standards to address security concerns. The U.S. implemented SPD-5 and the Federal Communications Commission (FCC); the FCC addressed orbital debris. While the EU created standards to address technological mandates and support the implementation of NIS2. Lastly, the ESA created a special operations center to safeguard their satellites. International governance is still evolving, but forums have been held by the United Nations Committee on the Peaceful Uses of Outer Space. International conversations under forums such as the UN Committee on the Peaceful Uses of Outer Space (COPUOS) progressively note the cyber–space safety relationship, though formal global norms specific to space cybersecurity continue evolving. == Risk management approaches == Through RMF, mitigation controls have been implemented to reduce the risk of exploitation while increasing the security of space. Controls addressing mitigation include proper configuration, system hardening, zero-trust architectures, encryption, etc. Both the government and industries have placed an emphasis on incident response procedures to identify, contain, and remediate breaches.

Top 10 AI Paragraph Rewriters Compared (2026)

Trying to pick the best AI paragraph rewriter? An AI paragraph rewriter is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI paragraph rewriter slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.