Example-based machine translation

Example-based machine translation

Example-based machine translation (EBMT) is a method of machine translation often characterized by its use of a bilingual corpus with parallel texts as its main knowledge base at run-time. It is essentially a translation by analogy and can be viewed as an implementation of a case-based reasoning approach to machine learning. == Translation by analogy == At the foundation of example-based machine translation is the idea of translation by analogy. When applied to the process of human translation, the idea that translation takes place by analogy is a rejection of the idea that people translate sentences by doing deep linguistic analysis. Instead, it is founded on the belief that people translate by first decomposing a sentence into certain phrases, then by translating these phrases, and finally by properly composing these fragments into one long sentence. Phrasal translations are translated by analogy to previous translations. The principle of translation by analogy is encoded to example-based machine translation through the example translations that are used to train such a system. Other approaches to machine translation, including statistical machine translation, also use bilingual corpora to learn the process of translation. == History == Example-based machine translation was first suggested by Makoto Nagao in 1984. He pointed out that it is especially adapted to translation between two totally different languages, such as English and Japanese. In this case, one sentence can be translated into several well-structured sentences in another language, therefore, it is no use to do the deep linguistic analysis characteristic of rule-based machine translation. == Example == Example-based machine translation systems are trained from bilingual parallel corpora containing sentence pairs like the example shown in the table above. Sentence pairs contain sentences in one language with their translations into another. The particular example shows an example of a minimal pair, meaning that the sentences vary by just one element. These sentences make it simple to learn translations of portions of a sentence. For example, an example-based machine translation system would learn three units of translation from the above example: How much is that X ? corresponds to Ano X wa ikura desu ka. red umbrella corresponds to akai kasa small camera corresponds to chiisai kamera Composing these units can be used to produce novel translations in the future. For example, if we have been trained using some text containing the sentences: President Kennedy was shot dead during the parade. and The convict escaped on July 15th., then we could translate the sentence The convict was shot dead during the parade. by substituting the appropriate parts of the sentences. == Phrasal verbs == Example-based machine translation is best suited for sub-language phenomena like phrasal verbs. Phrasal verbs have highly context-dependent meanings. They are common in English, where they comprise a verb followed by an adverb and/or a preposition, which are called the particle to the verb. Phrasal verbs produce specialized context-specific meanings that may not be derived from the meaning of the constituents. There is almost always an ambiguity during word-to-word translation from source to the target language. As an example, consider the phrasal verb "put on" and its Hindustani translation. It may be used in any of the following ways: Ram put on the lights. (Switched on) (Hindustani translation: Jalana) Ram put on a cap. (Wear) (Hindustani translation: Pahenna)

Automated negotiation

Automated negotiation is a form of interaction in systems that are composed of multiple autonomous agents, in which the aim is to reach agreements through an iterative process of making offers. Automated negotiation can be employed for many tasks human negotiators regularly engage in, such as bargaining and joint decision making. The main topics in automated negotiation revolve around the design of protocols and negotiating strategies. == History == Through digitization, the beginning of the 21st century has seen a growing interest in the automation of negotiation and e-negotiation systems, for example in the setting of e-commerce. This interest is fueled by the promise of automated agents being able to negotiate on behalf of human negotiators, and to find better outcomes than human negotiators. == Examples == Examples of automated negotiation include: Online dispute resolution, in which disagreements between parties are settled. Sponsored search auction, where bids are placed on advertisement keywords. Content negotiation, in which user agents negotiate over HTTP about how to best represent a web resource. Negotiation support systems, in which negotiation decision-making activities are supported by an information system.

Private message

In computer networking, a private message (PM), or direct message (DM), refers to a private communication, often text-based, sent or received by a user of a private communication channel on any given platform. Unlike public posts, PMs are only viewable by the participants. Long a function present on IRCs and Internet forums, private channels for PMs have also been prevalent features on instant messaging (IM) and on social media networks. It may be either synchronous (e.g. on an IM) or asynchronous (e.g. on an Internet forum). The term private message (PM) originated as a feature on internet forums, while the term direct message (DM) originated as a feature on Twitter. Due to the popularity of the latter service, DM has since been appropriated by other platforms, such as Instagram, and is often genericized in popular usage. == Overview == There are two main types of private messages, and one obscure type: One type includes those found on IRCs and Internet forums, as well as on social media services like Twitter, Facebook, and Instagram, where the focus is public posting, PMs allow users to communicate privately without leaving the platform. The second type are those relayed through instant messaging platforms such as WhatsApp and Snapchat, where users join the networks primarily to exchange PMs. A third type, peer-to-peer messaging, occurs when users create and own the infrastructure used to transmit and store the messages; while features vary depending on application, they give the user full control over the data they transmit. An example of software that enables this kind of messaging is Classified-ads. Besides serving as a tool to connect privately with friends and family, PMs have gained momentum in the workplace. Working professionals use PMs to reach coworkers in other spaces and increase efficiency during meetings. Although useful, using PMs in the workplace may blur the boundary between work and private lives. Some common forms of private messaging today include Facebook messaging (sometimes referred to as "inboxing"), Twitter direct messaging, and Instagram direct messaging. These forms of private messaging provide a private space on a usually public site. For instance, most activity on Twitter is public, but Twitter DMs provide a private space for communication between two users. This differs from mediums like email, texting, and Snapchat, where most or all activity is always private. Modern forms of private messaging may include multimedia messages, such as pictures or videos. == History == Email was first developed to send messages between different computers on ARPANET in 1971. Access to ARPANET was primarily limited to universities and other research institutions. Starting in 1983 or 1984, FidoNet allowed home computer users to send and receive email via bulletin board systems. Information services such as CompuServe, America Online, and Prodigy also helped to popularizes online messaging. The advent of the public World Wide Web in 1993 increased access to email via internet service providers, and later via webmail. Instant messaging systems became popular in the mid 1990s, as Internet access improved and personal computers became more common. The introduction of Skype in 2003 popularized Internet-based voice and video messaging. Direct messaging is now a feature of all major social networking services. == Privacy concerns == In January 2014, Matthew Campbell and Michael Hurley filed a class-action lawsuit against Facebook for breaching the Electronic Communications Privacy Act. They alleged that private messages which contained URLs were being read and used to generate profit, through data mining and user profiling, and that it was misleading for Facebook to refer to the functionality as "private" with the implication that the communication was "free from surveillance". In 2012, some Facebook users misinterpreted a redesign of the Facebook wall as publicly sharing private messages from 2008–2009. These were found to be public wall posts from those years, made at a time when it was not possible to like or comment on a wall post, making the notes look like private messages.

Cipher

In cryptography, a cipher (or cypher) is an algorithm for performing encryption or decryption—a series of well-defined steps that can be followed as a procedure. An alternative, less common term is encipherment. To encipher or encode is to convert information into cipher or code. In common parlance, "cipher" is synonymous with "code", as they are both a set of steps that encrypt a message; however, the concepts are distinct in cryptography, especially classical cryptography. Codes generally substitute different length strings of characters in the output, while ciphers generally substitute the same number of characters as are input. A code maps one meaning with another. Words and phrases can be coded as letters or numbers. Codes typically have direct meaning from input to key. Codes primarily function to save time. Ciphers are algorithmic. The given input must follow the cipher's process to be solved. Ciphers are commonly used to encrypt written information. Codes operated by substituting according to a large codebook which linked a random string of characters or numbers to a word or phrase. For example, "UQJHSE" could be the code for "Proceed to the following coordinates.". When using a cipher the original information is known as plaintext, and the encrypted form as ciphertext. The ciphertext message contains all the information of the plaintext message, but is not in a format readable by a human or computer without the proper mechanism to decrypt it. The operation of a cipher usually depends on a piece of auxiliary information, called a key (or, in traditional NSA parlance, a cryptovariable). The encrypting procedure is varied depending on the key, which changes the detailed operation of the algorithm. A key must be selected before using a cipher to encrypt a message, with some exceptions such as ROT13 and Atbash. Most modern ciphers can be categorized in several ways: By whether they work on blocks of symbols usually of a fixed size (block ciphers), or on a continuous stream of symbols (stream ciphers). By whether the same key is used for both encryption and decryption (symmetric key algorithms), or if a different key is used for each (asymmetric key algorithms). If the algorithm is symmetric, the key must be known to the recipient and sender and to no one else. If the algorithm is an asymmetric one, the enciphering key is different from, but closely related to, the deciphering key. If one key cannot be deduced from the other, the asymmetric key algorithm has the public/private key property and one of the keys may be made public without loss of confidentiality. == Etymology == Originating from the Sanskrit word for zero शून्य (śuṇya), via the Arabic word صفر (ṣifr), the word "cipher" spread to Europe as part of the Arabic numeral system during the Middle Ages. The Roman numeral system lacked the concept of zero, and this limited advances in mathematics. In this transition, the word was adopted into Medieval Latin as cifra, and then into Middle French as cifre. This eventually led to the English word cipher (also spelt cypher). One theory for how the term came to refer to encoding is that the concept of zero was confusing to Europeans, and so the term came to refer to a message or communication that was not easily understood. The term cipher was later also used to refer to any Arabic digit, or to calculation using them, so encoding text in the form of Arabic numerals is literally converting the text to "ciphers". == Versus codes == In casual contexts, "code" and "cipher" can typically be used interchangeably; however, the technical usages of the words refer to different concepts. Codes contain meaning; words and phrases are assigned to numbers or symbols, creating a shorter message. An example of this is the commercial telegraph code which was used to shorten long telegraph messages which resulted from entering into commercial contracts using exchanges of telegrams. Another example is given by whole word ciphers, which allow the user to replace an entire word with a symbol or character, much like the way written Japanese utilizes Kanji (meaning Chinese characters in Japanese) characters to supplement the native Japanese characters representing syllables. An example using English language with Kanji could be to replace "The quick brown fox jumps over the lazy dog" by "The quick brown 狐 jumps 上 the lazy 犬". Stenographers sometimes use specific symbols to abbreviate whole words. Ciphers, on the other hand, work at a lower level: the level of individual letters, small groups of letters, or, in modern schemes, individual bits and blocks of bits. Some systems used both codes and ciphers in one system, using superencipherment to increase the security. In some cases the terms codes and ciphers are used synonymously with substitution and transposition, respectively. Historically, cryptography was split into a dichotomy of codes and ciphers, while coding had its own terminology analogous to that of ciphers: "encoding, codetext, decoding" and so on. However, codes have a variety of drawbacks, including susceptibility to cryptanalysis and the difficulty of managing a cumbersome codebook. Because of this, codes have fallen into disuse in modern cryptography, and ciphers are the dominant technique. == Types == There are a variety of different types of encryption. Algorithms used earlier in the history of cryptography are substantially different from modern methods, and modern ciphers can be classified according to how they operate and whether they use one or two keys. === Historical === The Caesar Cipher is one of the earliest known cryptographic systems. Julius Caesar used a cipher that shifts the letters in the alphabet in place by three and wrapping the remaining letters to the front to write to Marcus Tullius Cicero in approximately 50 BC. Historical pen and paper ciphers used in the past are sometimes known as classical ciphers. They include simple substitution ciphers (such as ROT13) and transposition ciphers (such as a Rail Fence Cipher). For example, "GOOD DOG" can be encrypted as "PLLX XLP" where "L" substitutes for "O", "P" for "G", and "X" for "D" in the message. Transposition of the letters "GOOD DOG" can result in "DGOGDOO". These simple ciphers and examples are easy to crack, even without plaintext-ciphertext pairs. In the 1640s, the Parliamentarian commander, Edward Montagu, 2nd Earl of Manchester, developed ciphers to send coded messages to his allies during the English Civil War. The English theologian John Wilkins published a book in 1641 titled "Mercury, or The Secret and Swift Messenger" and described a musical cipher wherein letters of the alphabet were substituted for music notes. This species of melodic cipher was depicted in greater detail by author Abraham Rees in his book Cyclopædia (1778). Simple ciphers were replaced by polyalphabetic substitution ciphers (such as the Vigenère) which changed the substitution alphabet for every letter. For example, "GOOD DOG" can be encrypted as "PLSX TWF" where "L", "S", and "W" substitute for "O". With even a small amount of known or estimated plaintext, simple polyalphabetic substitution ciphers and letter transposition ciphers designed for pen and paper encryption are easy to crack. It is possible to create a secure pen and paper cipher based on a one-time pad, but these have other disadvantages. During the early twentieth century, electro-mechanical machines were invented to do encryption and decryption using transposition, polyalphabetic substitution, and a kind of "additive" substitution. In rotor machines, several rotor disks provided polyalphabetic substitution, while plug boards provided another substitution. Keys were easily changed by changing the rotor disks and the plugboard wires. Although these encryption methods were more complex than previous schemes and required machines to encrypt and decrypt, other machines such as the British Bombe were invented to crack these encryption methods. === Modern === Modern encryption methods can be divided by two criteria: by type of key used, and by type of input data. By type of key used ciphers are divided into: symmetric key algorithms (Private-key cryptography), where one same key is used for encryption and decryption, and asymmetric key algorithms (Public-key cryptography), where two different keys are used for encryption and decryption. In a symmetric key algorithm (e.g., DES and AES), the sender and receiver must have a shared key set up in advance and kept secret from all other parties; the sender uses this key for encryption, and the receiver uses the same key for decryption. The design of AES (Advanced Encryption System) was beneficial because it aimed to overcome the flaws in the design of the DES (Data encryption standard). AES's designer's claim that the common means of modern cipher cryptanalytic attacks are ineffective against AES due to its design structure. Ciphers can be distinguished into two types by the type o

Interplanetary Internet

The interplanetary Internet is a conceived computer network in space, consisting of a set of network nodes that can communicate with each other. These nodes are the planet's orbiters and landers, and the Earth ground stations. For example, the orbiters collect the scientific data from the Curiosity rover on Mars through near-Mars communication links, transmit the data to Earth through direct links from the Mars orbiters to the Earth ground stations via the NASA Deep Space Network, and finally the data routed through Earth's internal internet. Interplanetary communication is greatly delayed by interplanetary distances, as data transmission can only go as fast as the speed of light, so a new set of protocols and technologies that are tolerant to large delays and errors are required. The interplanetary Internet has been envisioned as a store and forward network of internets that is often disconnected, has a wireless backbone fraught with error-prone links and delays ranging from tens of minutes to even hours, even when there is a connection. As of 2024 agencies and companies working towards bringing the network to fruition include NASA, ESA, SpaceX and Blue Origin. == Challenges and reasons == In the core implementation of Interplanetary Internet, satellites orbiting a planet communicate to other planet's satellites. Simultaneously, these planets revolve around the Sun with long distances, and thus many challenges face the communications. The reasons and the resultant challenges are: The motion and long distances between planets: The interplanetary communication is greatly delayed due to the interplanetary distances and the motion of the planets. The delay is variable and long, ranging from a couple of minutes (Earth-to-Mars), to a couple of hours (Pluto-to-Earth), depending on their relative positions. The interplanetary communication also suspends due to the solar conjunction, when the sun's radiation hinders the direct communication between the planets. As such, the communication characterizes lossy links and intermittent link connectivity. Low embeddable payload: Satellites can only carry a small payload, which poses challenges to the power, mass, size, and cost for communication hardware design. An asymmetric bandwidth would be the result of this limitation. This asymmetry reaches ratios up to 1000:1 as downlink:uplink bandwidth portion. Absence of fixed infrastructure: The graph of participating nodes in a specific planet-to-planet communication keeps changing over time, due to the constant motion. The routes of the planet-to-planet communication are planned and scheduled rather than being opportunistic. The Interplanetary Internet design must address these challenges to operate successfully and achieve good communication with other planets. It also must use the few available resources efficiently in the system. == Development == Space communication technology has steadily evolved from expensive, one-of-a-kind point-to-point architectures, to the re-use of technology on successive missions, to the development of standard protocols agreed upon by space agencies of many countries. This last phase has gone on since 1982 through the efforts of the Consultative Committee for Space Data Systems (CCSDS), a body composed of the major space agencies of the world. It has 11 member agencies, 32 observer agencies, and over 119 industrial associates. The evolution of space data system standards has gone on in parallel with the evolution of the Internet, with conceptual cross-pollination where fruitful, but largely as a separate evolution. Since the late 1990s, familiar Internet protocols and CCSDS space link protocols have integrated and converged in several ways; for example, the successful FTP file transfer to Earth-orbiting STRV 1B on January 2, 1996, which ran FTP over the CCSDS IPv4-like Space Communications Protocol Specifications (SCPS) protocols. Internet Protocol use without CCSDS has taken place on spacecraft, e.g., demonstrations on the UoSAT-12 satellite, and operationally on the Disaster Monitoring Constellation. Having reached the era where networking and IP on board spacecraft have been shown to be feasible and reliable, a forward-looking study of the bigger picture was the next phase. The Interplanetary Internet study at NASA's Jet Propulsion Laboratory (JPL) was started by a team of scientists at JPL led by internet pioneer Vinton Cerf and the late Adrian Hooke. Cerf was appointed as a distinguished visiting scientist at JPL in 1998, while Hooke was one of the founders and directors of CCSDS. While IP-like SCPS protocols are feasible for short hops, such as ground station to orbiter, rover to lander, lander to orbiter, probe to flyby, and so on, delay-tolerant networking is needed to get information from one region of the Solar System to another. It becomes apparent that the concept of a region is a natural architectural factoring of the Interplanetary Internet. A region is an area where the characteristics of communication are the same. Region characteristics include communications, security, the maintenance of resources, perhaps ownership, and other factors. The Interplanetary Internet is a "network of regional internets". What is needed then, is a standard way to achieve end-to-end communication through multiple regions in a disconnected, variable-delay environment using a generalized suite of protocols. Examples of regions might include the terrestrial Internet as a region, a region on the surface of the Moon or Mars, or a ground-to-orbit region. The recognition of this requirement led to the concept of a "bundle" as a high-level way to address the generalized Store-and-Forward problem. Bundles are an area of new protocol development in the upper layers of the OSI model, above the Transport Layer with the goal of addressing the issue of bundling store-and-forward information so that it can reliably traverse radically dissimilar environments constituting a "network of regional internets". Delay-tolerant networking (DTN) was designed to enable standardized communications over long distances and through time delays. At its core is the Bundle Protocol (BP), which is similar to the Internet Protocol, or IP, that serves as the heart of the Internet here on Earth. The big difference between the regular Internet Protocol (IP) and the Bundle Protocol is that IP assumes a seamless end-to-end data path, while BP is built to account for errors and disconnections — glitches that commonly plague deep-space communications. Bundle Service Layering, implemented as the Bundling protocol suite for delay-tolerant networking, will provide general-purpose delay-tolerant protocol services in support of a range of applications: custody transfer, segmentation and reassembly, end-to-end reliability, end-to-end security, and end-to-end routing among them. The Bundle Protocol was first tested in space on the UK-DMC satellite in 2008. An example of one of these end-to-end applications flown on a space mission is the CCSDS File Delivery Protocol (CFDP), used on the Deep Impact comet mission. CFDP is an international standard for automatic, reliable file transfer in both directions. CFDP should not be confused with Coherent File Distribution Protocol, which has the same acronym and is an IETF-documented experimental protocol for rapidly deploying files to multiple targets in a highly networked environment. In addition to reliably copying a file from one entity (such as a spacecraft or ground station) to another entity, CFDP has the capability to reliably transmit arbitrarily small messages defined by the user, in the metadata accompanying the file, and to reliably transmit commands relating to file system management that are to be executed automatically on the remote end-point entity (such as a spacecraft) upon successful reception of a file. == Protocol == The Consultative Committee for Space Data Systems (CCSDS) packet telemetry standard defines the protocol used for the transmission of spacecraft instrument data over the deep-space channel. Under this standard, an image or other data sent from a spacecraft instrument is transmitted using one or more packets. === CCSDS packet definition === A packet is a block of data with length that can vary between successive packets, ranging from 7 to 65,542 bytes, including the packet header. Packetized data is transmitted via frames, which are fixed-length data blocks. The size of a frame, including frame header and control information, can range up to 2048 bytes. Packet sizes are fixed during the development phase. Because packet lengths are variable but frame lengths are fixed, packet boundaries usually do not coincide with frame boundaries. === Telecom processing notes === Data in a frame is typically protected from channel errors by error-correcting codes. Even when the channel errors exceed the correction capability of the error-correcting code, the presence of errors is nearly always detected by the e

Echo Lake (software)

Echo Lake (AKA Family Album Creator) was the most notable multimedia software product produced by Delrina, which debuted in June 1995. It was touted internally as a "cross [of] Quark Xpress and Myst". It featured an immersive 3D environment where a user could go to a virtual desktop in a virtual office and assemble video and audio clips along with images, and then print them out as either a virtual book other users of the program could use, or for print. It was a highly innovative product for its time, and ultimately was hampered by the inability of many users able to input their own multimedia content easily into a computer from that period. Creative Wonders bought the rights to the Echo Lake multimedia product, which was re-shaped as an introductory program on multimedia and re-released as Family Album Creator in 1996.

Majal (organization)

Majal is a regional not-for-profit organization focused on "amplifying voices of dissent" throughout the Middle East and North Africa via digital media. Founded in Bahrain, the organization "creates platforms and web applications that promote freedom of expression and social justice." Majal, which relies on open source platforms, like WordPress and Ruby on Rails, was launched in 2006 by Esra'a Al Shafei as a simple group-blogging idea. However, it has changed course to focus on the development of unique applications and tools. == Objectives and means == Majal's content, in addition to its projects and applications, is free open source content to ensure right to access information for everyone. The organization uses a broad spectrum of social media tools, ranging from written blogs, podcasts, vlogs, comics, video animation and pictures to live broadcasting through radio. == Projects and applications == Majal runs various active projects that include Alliance for Kurdish Rights, The Muslim Network for Baháʼí Rights, a discussion tool for Arab LGBT youth and various Mobile apps. == Funding == Majal is funded through private donations and grants from non-governmental organizations, as well as any potential revenues earned through freelance development. Its primary funders are the Shuttleworth Foundation and the Omidyar Network. In 2008, Majal won the Berkman Award from the Berkman Klein Center for Internet & Society at Harvard University in the Human Rights/Global Advocacy category. This $10,000 award was Majal’s first source of funding. This award is presented to “people or institutions that have made a significant contribution to the Internet and its impact on society over the past decade.” In 2009, the March 18 Movement, a project of Majal, received the Think Social Award, which demonstrates how social media can be used to solve the world’s problems. Esra'a Al-Shafei was named a 2009 Echoing Green Fellow for Civil and Human Rights, a seed funding award for young entrepreneurs engaged in social change. Financially, the fellowship consists of a $60,000 stipend paid over two years. Most recently, MEY has received a grant from the Arab Fund for Arts and Culture for its Mideast Tunes website. == Awards == Winner of Human Rights Tulip 2014 Human Rights Tulip - Human rights - Government.nl Ashoka Changemakers Citizen Media competition in 2011 for their CrowdVoice project. Monaco Media Prize 2011 for Majal founder and director Esra'a Al Shafei in 2011. The BOBS Special Topic Human Rights award in 2011 for the Majal website Migrant Rights. ThinkSocial Award in 2009, as powerful model for how social media can be used to address global problems. Echoing Green, 2009 Fellowship. TEDGlobal 2009 Fellowship. Berkman Award for Internet Innovation from Berkman Klein Center for Internet & Society at Harvard Law School in 2008 for the outstanding contributions to the internet and its impact on society. The Global Journal selected Majal as one of the Top 100 NGOs in 2013. 2013-2014 Shuttleworth Foundation Fellowship. == Leadership == Majal team is led primarily by women. The organization was founded by Esra'a Al Shafei, a blogger from Bahrain in 2006. Ahmed Zidan of Egypt has served for over three years as the Editor-in-Chief of Majal Arabic, and is the co-founder of Ahwaa, and is also a podcaster. Other team members include Mona Kareem, Rima Kalush, Abir Ghattas, Namita Malhotra, and Vani Saraswathi. == 2011 Middle East and North Africa protests == Blogs and video played a role in the documentation of protests throughout the Middle East and North Africa during 2010-2011, also known as the Arab Spring. During this period, MEY's project, CrowdVoice (launched in 2010) helped curate and archive the large amounts of videos, images, and eye-witness reports being aggregated and crowdsourced from across the region. As a result, it had been censored temporarily in Yemen and is still censored in Bahrain. == Media coverage == Majal claims to have received various coverage from news agencies, TV satellite channels, radio stations, newspapers, magazines. For instance, Sky News, CNN, New York Times, BBC, The Guardian, NPR, Time, MTV political blog "Act", VH1, Daily Telegraph, Die Zeit, Frankfurter Rundschau FR-online, Toronto Star, TechCrunch, Rolling Stone Middle East, Abu Dhabi TV, Gulf News, Al-Hasnaa' magazine, ReadWriteWeb, Mashable, The Next Web, Radio Sawt Beirut International, Radio Farda among many others.