AI Analytics Football

AI Analytics Football — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Amália (LLM)

    Amália (LLM)

    Amália is a Portuguese large language model (LLM) announced in November 2024 by the Portuguese Prime-Minister Luís Montenegro. Its final version is expected to be launched in 2026. It is being developed by Center for Responsible AI (Centro para a AI Responsável) and by the research centers of NOVA School of Science and Technology and Instituto Superior Técnico. == History == In 2024 it was announced that the Portuguese Agency for Administrative Modernization (Agência para a Modernização Administrativa) transpose this LLM to Portuguese Public Administration. According to Paulo Dimas (CEO of the Center for Responsible AI) the three fundamental points of this LLM project are the linguistic variant (European Portuguese), cultural representation and data protection. In April 2025 it was announced that Amália had entered beta phase with an improved version being expected to be launched in September 2025. The beta version released in September is available only to the Public Administration, but the website launched in October reiterates the final version will be an open model.

    Read more →
  • Brooklyn Bridge (software)

    Brooklyn Bridge (software)

    The Brooklyn Bridge from White Crane Systems was a data transfer enabler. Although it came with some hardware, it was the software which was the basis of the product. It also could transform the data's format. == Overview == The New York Times described its category as being among "communications packages used to transfer files." In an era of 300 baud, Brooklyn Bridge operated at "115,200 baud" so that a transfer which "at 300 baud took 4 minutes and 36 seconds" only needed 5 seconds. Unlike some communications packages, this one retains the original version-date, so as not to alarm people when they seem to have what looks like an update, when it's not. == Description == Once the software is installed, users comfortable with typing the word "COPY" can do so as readily as they sneakernet. An earlier review described it as "less cumbersome than conventional communications software" The use of neither specialized hardware nor specialized software is ideal in an era when this can be done using online or other "outside" services.

    Read more →
  • Signals intelligence

    Signals intelligence

    Signals intelligence (SIGINT) is the act and field of intelligence-gathering by interception of signals, whether communications between people (communications intelligence—abbreviated to COMINT) or from electronic signals not directly used in communication (electronic intelligence—abbreviated to ELINT). As classified and sensitive information is usually encrypted, signals intelligence may necessarily involve cryptanalysis (to decipher the messages). Traffic analysis—the study of who is signaling to whom and in what quantity—is also used to integrate information, and it may complement cryptanalysis. == History == === Origins === Electronic interceptions appeared as early as 1900, during the Boer War of 1899–1902. The British Royal Navy had installed wireless sets produced by Marconi on board their ships in the late 1890s, and the British Army used some limited wireless signalling. The Boers captured some wireless sets and used them to make vital transmissions. Since the British were the only people transmitting at the time, the British did not need special interpretation of the signals that they were. The birth of signals intelligence in a modern sense dates from the Russo-Japanese War of 1904–1905. As the Russian fleet prepared for conflict with Japan in 1904, the British ship HMS Diana stationed in the Suez Canal intercepted Russian naval wireless signals being sent out for the mobilization of the fleet, for the first time in history. === Development in World War I === Over the course of the First World War, a new method of signals intelligence reached maturity. Russia's failure to properly protect its communications fatally compromised the Russian Army's advance early in World War I and led to their disastrous defeat by the Germans under Ludendorff and Hindenburg at the Battle of Tannenberg. In 1918, French intercept personnel captured a message written in the new ADFGVX cipher, which was cryptanalyzed by Georges Painvin. This gave the Allies advance warning of the German 1918 Spring Offensive. The British in particular, built up great expertise in the newly emerging field of signals intelligence and codebreaking (synonymous with cryptanalysis). On the declaration of war, Britain cut all German undersea cables. This forced the Germans to communicate exclusively via either (A) a telegraph line that connected through the British network and thus could be tapped; or (B) through radio which the British could then intercept. Rear Admiral Henry Oliver appointed Sir Alfred Ewing to establish an interception and decryption service at the Admiralty; Room 40. An interception service known as 'Y' service, together with the post office and Marconi stations, grew rapidly to the point where the British could intercept almost all official German messages. The German fleet was in the habit each day of wirelessing the exact position of each ship and giving regular position reports when at sea. It was possible to build up a precise picture of the normal operation of the High Seas Fleet, to infer from the routes they chose where defensive minefields had been placed and where it was safe for ships to operate. Whenever a change to the normal pattern was seen, it immediately signalled that some operation was about to take place, and a warning could be given. Detailed information about submarine movements was also available. The use of radio-receiving equipment to pinpoint the location of any single transmitter was also developed during the war. Captain H.J. Round, working for Marconi, began carrying out experiments with direction-finding radio equipment for the army in France in 1915. By May 1915, the Admiralty was able to track German submarines crossing the North Sea. Some of these stations also acted as 'Y' stations to collect German messages, but a new section was created within Room 40 to plot the positions of ships from the directional reports. Room 40 played an important role in several naval engagements during the war, notably in detecting major German sorties into the North Sea. The battle of Dogger Bank was won in no small part due to the intercepts that allowed the Navy to position its ships in the right place. It played a vital role in subsequent naval clashes, including at the Battle of Jutland as the British fleet was sent out to intercept them. The direction-finding capability allowed for the tracking and location of German ships, submarines, and Zeppelins. The system was so successful that by the end of the war, over 80 million words, comprising the totality of German wireless transmission over the course of the war, had been intercepted by the operators of the Y-stations and decrypted. However, its most astonishing success was in decrypting the Zimmermann Telegram, a telegram from the German Foreign Office sent via Washington to its ambassador Heinrich von Eckardt in Mexico. === Postwar consolidation === With the importance of interception and decryption firmly established by the wartime experience, countries established permanent agencies dedicated to this task in the interwar period. In 1919, the British Cabinet's Secret Service Committee, chaired by Lord Curzon, recommended that a peace-time codebreaking agency should be created. The Government Code and Cypher School (GC&CS) was the first peace-time codebreaking agency, with a public function "to advise as to the security of codes and cyphers used by all Government departments and to assist in their provision", but also with a secret directive to "study the methods of cypher communications used by foreign powers". GC&CS officially formed on 1 November 1919, and produced its first decrypt on 19 October. By 1940, GC&CS was working on the diplomatic codes and ciphers of 26 countries, tackling over 150 diplomatic cryptosystems. The US Cipher Bureau was established in 1919 and achieved some success at the Washington Naval Conference in 1921, through cryptanalysis by Herbert Yardley. Secretary of War Henry L. Stimson closed the US Cipher Bureau in 1929 with the words "Gentlemen do not read each other's mail." === World War II === The use of SIGINT had even greater implications during World War II. The combined effort of intercepts and cryptanalysis for the whole of the British forces in World War II came under the code name "Ultra", managed from Government Code and Cypher School at Bletchley Park. Properly used, the German Enigma and Lorenz ciphers should have been virtually unbreakable, but flaws in German cryptographic procedures, and poor discipline among the personnel carrying them out, created vulnerabilities which made Bletchley's attacks feasible. Bletchley's work was essential to defeating the U-boats in the Battle of the Atlantic, and to the British naval victories in the Battle of Cape Matapan and the Battle of North Cape. In 1941, Ultra exerted a powerful effect on the North African desert campaign against German forces under General Erwin Rommel. General Sir Claude Auchinleck wrote that were it not for Ultra, "Rommel would have certainly got through to Cairo". Ultra decrypts featured prominently in the story of Operation SALAM, László Almásy's mission across the desert behind Allied lines in 1942. Prior to the Normandy landings on D-Day in June 1944, the Allies knew the locations of all but two of Germany's fifty-eight Western Front divisions. Winston Churchill was reported to have told King George VI: "It is thanks to the secret weapon of General Menzies, put into use on all the fronts, that we won the war!" Supreme Allied Commander, Dwight D. Eisenhower, at the end of the war, described Ultra as having been "decisive" to Allied victory. Official historian of British Intelligence in World War II Sir Harry Hinsley argued that Ultra shortened the war "by not less than two years and probably by four years"; and that, in the absence of Ultra, it is uncertain how the war would have ended. At a lower level, German cryptanalysis, direction finding, and traffic analysis were vital to Rommel's early successes in the Western Desert Campaign until British forces tightened their communications discipline and Australian raiders destroyed his principal SIGINT Company. == Technical definitions == The United States Department of Defense has defined the term "signals intelligence" as: A category of intelligence comprising either individually or in combination all communications intelligence (COMINT), electronic intelligence (ELINT), and foreign instrumentation signals intelligence (FISINT), however transmitted. Intelligence derived from communications, electronic, and foreign instrumentation signals. Being a broad field, SIGINT has many sub-disciplines. The two main ones are communications intelligence (COMINT) and electronic intelligence (ELINT). == Disciplines shared across the branches == === Targeting === A collection system has to know to look for a particular signal. "System", in this context, has several nuances. Targeting is the process of developing collection requirements: "1. A

    Read more →
  • Social media newsroom

    Social media newsroom

    A social media newsroom is a company resource, set up to increase the functionality and usability of the traditional online newsroom. Social media newsrooms (SMNs) are intended to encourage dialogue and information sharing. Unlike online newsrooms, content is accessible to more than just journalists, but to all those with whom the company engages such as bloggers, their prospects, customers, business partners and investors. It gives these stakeholders access to news, public relations announcements, images, audio, video and other multimedia files. In addition to posting press releases and corporate news, companies can integrate other social content from sites such as YouTube, Flickr and Slideshow as well as streams from corporate Twitter accounts. Traditional tools for journalists such as corporate fast facts, leadership information, a multimedia library, financial information, awards and other recent media coverage are also included in an SMN. Examples of companies effectively using social media newsrooms include Opel Group, Pressat, First Direct, MyNewsdesk, Scania and Newport Beach.

    Read more →
  • OpenIO

    OpenIO

    OpenIO offered object storage for a wide range of high-performance applications. OpenIO was founded in 2015 by Laurent Denel (CEO), Jean-François Smigielski (CTO) and five other co-founders; it leveraged open source software, developed since 2006, based on a grid technology that enabled dynamic behaviour and supported heterogenous hardware. In October 2017 OpenIO was completed a $5 million funding rounds. In July 2020 OpenIO had been acquired by OVH and withdrawn from the market to become the core technology of OVHcloud object storage offering. == Software == OpenIO is a software-defined object store that supports S3 and can be deployed on-premises, cloud-hosted or at the edge, on any hardware mix. It has been designed from the beginning for performance and cost-efficiency at any scale, and it has been optimized for Big Data, HPC and AI. OpenIO stores objects within a flat structure within a massively distributed directory with indirections, which allows the data query path to be independent of the number of nodes and the performance not to be affected by the growth of capacity. Servers are organized as a grid of nodes massively distributed, where each node takes part in directory and storage services, which ensures that there is no single point of failure and that new nodes are automatically discovered and immediately available without the need to rebalance data. The software is built on top of a technology that ensures optimal data placement based on real-time metrics and allows the addition or removal of storage devices with automatic performance and load impact optimization. For data protection OpenIO has synchronous and asynchronous replication with multiple copies, and an erasure coding implementation based on Reed-Solomon that can be deployed in one data center or geo-distributed or stretched clusters. The software has a feature that catches all events that occur in the cluster and can pass them up in the stack or to applications running on OpenIO nodes. This enables event-driven computing directly into the storage infrastructure. The open source code is available on Github and it is licensed under AGPL3 for server code and LGPL3 for client code. == Performance == OpenIO claimed in 2019 to have reached 1.372 Tbit/s write speed (171 GB/s) on a cluster of 350 physical machines. The benchmark scenario, conducted under production conditions with standard hardware (commodity servers with 7200 rpm HDDs), consisted in backing up a 38 PB Hadoop datalake via the DistCp command. This level of performance marked, according to analysts, the arrival of a new generation of object storage technologies oriented toward high performance and hyper-scalability.

    Read more →
  • Tokenization (data security)

    Tokenization (data security)

    Tokenization, when applied to data security, is the process of substituting a sensitive data element with a non-sensitive equivalent, referred to as a token, that has no intrinsic or exploitable meaning or value. The token is a reference (i.e. identifier) that maps back to the sensitive data through a tokenization system. The mapping from original data to a token uses methods that render tokens infeasible to reverse in the absence of the tokenization system, for example using tokens created from random numbers. A one-way cryptographic function is used to convert the original data into tokens, making it difficult to recreate the original data without obtaining entry to the tokenization system's resources. To deliver such services, the system maintains a vault database of tokens that are connected to the corresponding sensitive data. Protecting the system vault is vital to the system, and improved processes must be put in place to offer database integrity and physical security. The tokenization system must be secured and validated using security best practices applicable to sensitive data protection, secure storage, audit, authentication and authorization. The tokenization system provides data processing applications with the authority and interfaces to request tokens, or detokenize back to sensitive data. The security and risk reduction benefits of tokenization require that the tokenization system is logically isolated and segmented from data processing systems and applications that previously processed or stored sensitive data replaced by tokens. Only the tokenization system can tokenize data to create tokens, or detokenize back to redeem sensitive data under strict security controls. The token generation method must be proven to have the property that there is no feasible means through direct attack, cryptanalysis, side channel analysis, token mapping table exposure or brute force techniques to reverse tokens back to live data. Replacing live data with tokens in systems is intended to minimize exposure of sensitive data to those applications, stores, people and processes, reducing risk of compromise or accidental exposure and unauthorized access to sensitive data. Applications can operate using tokens instead of live data, with the exception of a small number of trusted applications explicitly permitted to detokenize when strictly necessary for an approved business purpose. Tokenization systems may be operated in-house within a secure isolated segment of the data center, or as a service from a secure service provider. Tokenization may be used to safeguard sensitive data involving, for example, bank accounts, financial statements, medical records, criminal records, driver's licenses, loan applications, stock trades, voter registrations, and other types of personally identifiable information (PII). Tokenization is often used in credit card processing. The PCI Council defines tokenization as "a process by which the primary account number (PAN) is replaced with a surrogate value called a token. A PAN may be linked to a reference number through the tokenization process. In this case, the merchant simply has to retain the token and a reliable third party controls the relationship and holds the PAN. The token may be created independently of the PAN, or the PAN can be used as part of the data input to the tokenization technique. The communication between the merchant and the third-party supplier must be secure to prevent an attacker from intercepting to gain the PAN and the token. De-tokenization is the reverse process of redeeming a token for its associated PAN value. The security of an individual token relies predominantly on the infeasibility of determining the original PAN knowing only the surrogate value". The choice of tokenization as an alternative to other techniques such as encryption will depend on varying regulatory requirements, interpretation, and acceptance by respective auditing or assessment entities. This is in addition to any technical, architectural or operational constraint that tokenization imposes in practical use. == Concepts and origins == The concept of tokenization, as adopted by the industry today, has existed since the first currency systems emerged centuries ago as a means to reduce risk in handling high value financial instruments by replacing them with surrogate equivalents. In the physical world, coin tokens have a long history of use replacing the financial instrument of minted coins and banknotes. In more recent history, subway tokens and casino chips found adoption for their respective systems to replace physical currency and cash handling risks such as theft. Exonumia and scrip are terms synonymous with such tokens. In the digital world, similar substitution techniques have been used since the 1970s as a means to isolate real data elements from exposure to other data systems. In databases for example, surrogate key values have been used since 1976 to isolate data associated with the internal mechanisms of databases and their external equivalents for a variety of uses in data processing. More recently, these concepts have been extended to consider this isolation tactic to provide a security mechanism for the purposes of data protection. In the payment card industry, tokenization is one means of protecting sensitive cardholder data in order to comply with industry standards and government regulations. Tokenization was applied to payment card data by Shift4 Corporation and released to the public during an industry Security Summit in Las Vegas, Nevada in 2005. The technology is meant to prevent the theft of the credit card information in storage. Shift4 defines tokenization as: "The concept of using a non-decryptable piece of data to represent, by reference, sensitive or secret data. In payment card industry (PCI) context, tokens are used to reference cardholder data that is managed in a tokenization system, application or off-site secure facility." To protect data over its full lifecycle, tokenization is often combined with end-to-end encryption to secure data in transit to the tokenization system or service, with a token replacing the original data on return. For example, to avoid the risks of malware stealing data from low-trust systems such as point of sale (POS) systems, as in the Target breach of 2013, cardholder data encryption must take place prior to card data entering the POS and not after. Encryption takes place within the confines of a security hardened and validated card reading device and data remains encrypted until received by the processing host, an approach pioneered by Heartland Payment Systems as a means to secure payment data from advanced threats, now widely adopted by industry payment processing companies and technology companies. The PCI Council has also specified end-to-end encryption (certified point-to-point encryption—P2PE) for various service implementations in various PCI Council Point-to-point Encryption documents. == The tokenization process == The process of tokenization consists of the following steps: The application sends the tokenization data and authentication information to the tokenization system. It is stopped if authentication fails and the data is delivered to an event management system. As a result, administrators can discover problems and effectively manage the system. The system moves on to the next phase if authentication is successful. Using one-way cryptographic or random generation techniques, a token is generated and kept in a highly secure data vault. The new token is provided to the application for further use, replacing the sensitive data for processing and storage. Tokenization systems share several components according to established standards. Token generation is the process of producing a token using any means, such as one-way nonreversible cryptographic functions (e.g., a hash function with a strong, secret salt) or assignment via a randomly generated number. Random number generator (RNG) techniques are often the best choice for generating token values. Token mapping – this is the process of assigning the created token value to its original value. To enable permitted look-ups of the original value using the token as the index, a secure cross-reference database must be constructed. Token data store – this is a central repository for the token mapping process that holds the original sensitive values and their related token values. Sensitive data and token values must be securely kept in an encrypted format. Management of cryptographic keys. Strong key management procedures are required for sensitive data encryption on token data stores. == Difference from encryption == Tokenization and "classic" encryption effectively protect data if implemented properly, and a computer security system may use both. While similar in certain regards, tokenization and classic encryption differ in a few key aspects. Both are cryptographic data security methods and the

    Read more →
  • Atomicity (database systems)

    Atomicity (database systems)

    In database systems, atomicity (; from Ancient Greek: ἄτομος, romanized: átomos, lit. 'undividable') is the property of a database transaction consisting of an indivisible and irreducible series of database operations such that either all occur, or none occur. It is one of the ACID transaction properties: Atomicity, Consistency, Isolation, Durability. A guarantee of atomicity prevents partial database updates from occurring, because they can cause greater problems than rejecting the whole series outright. As a consequence, an atomic transaction cannot be observed to be in progress by another database client: at one moment in time, it has not yet happened, and at the next it has already occurred in whole (or nothing happened if the transaction was cancelled in progress). An example of transaction atomicity could be a digital monetary transfer from bank account A to account B. It consists of two operations, debiting the money from account A and crediting it to account B. Performing both of these operations inside of an atomic transaction ensures that the database remains in a consistent state, if either operation fails there will not be any unaccountable credits or debits affecting either account. The same term is also used in the definition of First normal form in database systems, where it instead refers to the concept that the values for fields may not consist of multiple smaller values to be decomposed, such as a string into which multiple names, numbers, dates, or other types may be packed. == Orthogonality == Atomicity does not behave completely orthogonally with regard to the other ACID properties of transactions. For example, isolation relies on atomicity to roll back the enclosing transaction in the event of an isolation violation such as a deadlock; consistency also relies on atomicity to roll back the enclosing transaction in the event of a consistency violation by an illegal transaction. As a result of this, a failure to detect a violation and roll back the enclosing transaction may cause an isolation or consistency failure. == Implementation == Typically, systems implement Atomicity by providing some mechanism to indicate which transactions have started and which finished; or by keeping a copy of the data before any changes occurred (Read-copy-update). Several filesystems have developed methods for avoiding the need to keep multiple copies of data, using journaling (see journaling file system). Databases usually implement this using some form of logging/journaling to track changes. The system synchronizes the logs (often the metadata) as necessary after changes have successfully taken place. Afterwards, crash recovery ignores incomplete entries. Although implementations vary depending on factors such as concurrency issues, the principle of atomicity – i.e. complete success or complete failure – remain. Ultimately, any application-level implementation relies on operating-system functionality. At the file-system level, POSIX-compliant systems provide system calls such as open(2) and flock(2) that allow applications to atomically open or lock a file. At the process level, POSIX Threads provide adequate synchronization primitives. The hardware level requires atomic operations such as Test-and-set, Fetch-and-add, Compare-and-swap, or Load-Link/Store-Conditional, together with memory barriers. Portable operating systems cannot simply block interrupts to implement synchronization, since hardware that lacks concurrent execution such as hyper-threading or multi-processing is now extremely rare. In distributed and sharded databases, atomicity is complicated by network latency and the potential for partial failures. While traditional distributed systems often employ locking protocols (like 2PC) to ensure cross-shard atomicity, these can introduce performance bottlenecks. Recent research into distributed ledger consensus suggests alternative models, such as "braided synchronization". This technique, utilized in protocols like Cerberus, intertwines the consensus phases of multiple shards to enforce atomic guarantees without a global ordering of all transactions.

    Read more →
  • KLJN Secure Key Exchange

    KLJN Secure Key Exchange

    Random-resistor-random-temperature Kirchhoff-law-Johnson-noise key exchange, also known as RRRT-KLJN or simply KLJN, is an approach for distributing cryptographic keys between two parties that claims to offer unconditional security. This claim, which has been contested, is significant, as the only other key exchange approach claiming to offer unconditional security is Quantum key distribution. The KLJN secure key exchange scheme was proposed in 2005 by Laszlo Kish and Granqvist. It has the advantage over quantum key distribution in that it can be performed over a metallic wire with just four resistors, two noise generators, and four voltage measuring devices---equipment that is low-priced and can be readily manufactured. It has the disadvantage that several attacks against KLJN have been identified which must be defended against. "Given that the amount of effort and funding that goes into Quantum Cryptography is substantial (some even mock it as a distraction from the ultimate prize which is quantum computing), it seems to me that the fact that classic thermodynamic resources allow for similar inherent security should give one pause," wrote Henning Dekant, the founder of the Quantum Computing Meetup, in April 2013. The Cybersecurity Curricula 2017, a joint project of the Association for Computing Machinery, the IEEE Computer Society, the Association for Information Systems, and the International Federation for Information Processing Technical Committee on Information Security Education (IFIP WG 11.8) recommends teaching the KLJN Scheme as part of teaching "Advanced concepts" in its knowledge unit on cryptography. == See Also/Further Reading ==

    Read more →
  • Doubao

    Doubao

    Doubao (Chinese: 豆包) is an artificial intelligence assistant developed by ByteDance. == History == The chatbot was launched in August 2023. By November 2024, it had become China's most popular AI chatbot, with approximately 60 million monthly active users according to industry analytics. == Design == Doubao is powered by Volcano Engine (Volcengine), 120 trillion tokens consumed per day. == Variants == === Dola === The international version of Doubao is Dola which was launched in August 2023 as Cici. Dola is powered by OpenAI's GPT series of large language models and by Google's Gemini.

    Read more →
  • Codebook

    Codebook

    A codebook is a type of document used for gathering and storing cryptography codes. Originally, codebooks were often literally books, but today "codebook" is a byword for the complete record of a series of codes, regardless of physical format. == Cryptography == In cryptography, a codebook is a document used for implementing a code. A codebook contains a lookup table for coding and decoding; each word or phrase has one or more strings which replace it. To decipher messages written in code, corresponding copies of the codebook must be available at either end. The distribution and physical security of codebooks presents a special difficulty in the use of codes compared to the secret information used in ciphers, the key, which is typically much shorter. The United States National Security Agency documents sometimes use codebook to refer to block ciphers; compare their use of combiner-type algorithm to refer to stream ciphers. Codebooks come in two forms, one-part or two-part: In one-part codes, the plaintext words and phrases and the corresponding code words are in the same alphabetical order. They are organized similar to a standard dictionary. Such codes are half the size of two-part codes but are more vulnerable since an attacker who recovers some code word meanings can often infer the meaning of nearby code words. One-part codes may be used simply to shorten messages for transmission or have their security enhanced with superencryption methods, such as adding a secret number to numeric code words. In two-part codes, one part is for converting plaintext to ciphertext, the other for the opposite purpose. They are usually organized similarly to a language translation dictionary, with plaintext words (in the first part) and ciphertext words (in the second part) presented like dictionary headwords. The earliest known use of a codebook system was by Gabriele de Lavinde in 1379 working for the Antipope Clement VII. Two-part codebooks go back as least as far as Antoine Rossignol in the 1800s. From the 15th century until the middle of the 19th century, nomenclators (named after nomenclator) were the most used cryptographic method. Codebooks with superencryption were the most used cryptographic method of World War I. The JN-25 code used in World War II used a codebook of 30,000 code groups superencrypted with 30,000 random additives. The book used in a book cipher or the book used in a running key cipher can be any book shared by sender and receiver and is different from a cryptographic codebook. == Social sciences == In social sciences, a codebook is a document containing a list of the codes used in a set of data to refer to variables and their values, for example locations, occupations, or clinical diagnoses. == Data compression == Codebooks were also used in 19th- and 20th-century commercial codes for the non-cryptographic purpose of data compression. Codebooks are used in relation to precoding and beamforming in mobile networks such as 5G and LTE. The usage is standardized by 3GPP, for example in the document TS 38.331, NR; Radio Resource Control (RRC); Protocol specification.

    Read more →
  • Peñabot

    Peñabot

    Peñabot is the nickname for automated social media accounts allegedly used by the Mexican government of Enrique Peña Nieto and the PRI political party to keep unfavorable news from reaching the Mexican public. Peñabot accusations are related to the broader issue of fake news in the 21st century. == History of disinformation in Mexican politics == The PRI political party has been reported to use fake news since before Peña Nieto. The main tactic originally was to spread such propaganda through open radio and television networks. Such tactic was effective in Mexico, because newspaper readership is low and cable TV is largely limited to the middle classes; consequently, the country's two major television networks – Televisa and TV Azteca – exert a significant influence in national politics. Televisa itself, not only owns around two-thirds of the programming on Mexico's TV channels, making it not only Mexico's largest television network, but also is the largest media network in the Spanish-speaking world. == Peñabots == Analysts have given the name Peñabots to a suspected network of automated accounts on social media used by the Mexican government to spread pro-government propaganda and to marginalize dissenting opinions in social media. The bots were first noticed in the 2012 elections when they were used to disseminate opinions in support of Enrique Peña Nieto on social networks such as Twitter and Facebook. According to Aristegui Noticias, their usage went against articles 6 and 134 of the Mexican Constitution. Those used by Peña Nieto's government cost an estimated 80 million pesos monthly, which news outlets argued only helped the government spread fake support towards the president, but did not have a benefit towards Mexican people (with whom EPN was highly unpopular). Facebook held approximately 640,321 Peñabots, while Twitter had less. As of July 2017, Oxford Internet Institute's Computational Propaganda Research Project claimed many western democracies, Mexico included, perform social media manipulation, thus saying the manipulation comes directly from the Mexican government itself. During Peña Nieto's subsequent presidency, analysts noted that Peñabots were used to overpower trending topics that critiqued government, to flood trending government critical hashtags with spam, to create fake trends by pushing alternative hashtags, and to push smear campaigns and threats against government-critical activists and journalists. Peñabots were distinguished as their pattern of activity was distinct from that of ordinary interaction on social networks. === Meadebots === On Twitter it was reported that about 94% of the followers of 2018 presidential candidate from the PRI Jose Antonio Meade were bots. When Antonio Meade presented himself as a candidate for the 2018 presidential election, his social media accounts such as "@MovimientoMEADE" (created by the PRI's official account @PRI_Nacional), obtained a huge quantity of followers in a short span of time. Some users noticed and brought it to attention, and after investigation it was reported 94% of such followers were bots (702,000 out of 747,000), and the account was eliminated from Twitter after 20 hours. The fake accounts used the hashtags #YoConMeade and #Meade18. It was further revealed was that Meade's official account on Twitter, @JoseAMeadeK had 25% bots (216,000 fake followers out of the 981,000). == Manipulation of news media in Mexico, through television == The Mexican government of Peña Nieto has been accused of using various means to keep unfavorable news from reaching the Mexican people. Many Mexicans have protested this practice as it clearly goes against the freedom of speech. The PRI has been reported to use fake news since before Peña Nieto. The main tactic has been to spread such propaganda through radio and television. This tactic is perceived as effective in Mexico, because newspaper readership is low and research on the Internet and cable TV is largely limited to the middle classes; consequently, the country's two major television networks – Televisa and TV Azteca – exert a significant influence in national politics. Televisa itself, owns around two-thirds of the programming on Mexico's TV channels, making it not only Mexico's largest television network, but also is the largest media network in the Spanish-speaking world. In June 2012, before the 2012 Mexican presidential elections, the British newspaper The Guardian published a series of allegations claiming Televisa, sold favorable coverage to top politicians in its news and entertainment shows, this scandal became known as the Televisa controversy. The documents published by 'The Guardian alleged that a secretive circle within Televisa manipulated news coverage to favor PRI presidential candidate Enrique Peña Nieto, who was poised as favorite to win. Televisa's secret circle supposedly commissioned videos to promote Peña Nieto and lash out his political rivals in 2009. The Guardian documents suggest that Televisa's secret team distributed such videos through e-mail, posting them posted them on Facebook and YouTube, some can still be seen there. Another document was a PowerPoint presentation, with a slide explicitly aimed at rival leftist candidate of the Party of the Democratic Revolution (PRD), Andrés Manuel López Obrador. Supposedly given to The Guardian by a Televisa employee. The document's authenticity was never possible to confirm– however dates, names, and events largely coincide. Televisa refused to talk the documents, and denied a relationship with the PRI or its presidential candidate, saying that they had provided equal media coverage to all parties. Televisa published an article supposedly showing discrepancies in The Guardian documents and denying accusations. Mexican citizens complained about the perceived favoritism towards Enrique Peña Nieto and the PRI, protesting through the Yo Soy 132 movement which Televisa covered in detail. However, Televisa's news media coverage is perceived to have been biased, by using a media coverage tactic Mexican citizens call cortinas de humo (smoke screens). These introduce a news scandal giving extensive coverage to distract citizens from a potential conflict-of-interest or controversy that could damage the image of the politician favored by the network. An example of a perceived smoke screen would be the news media coverage of "Caso Michoacán" and "Caso Paolette" distracting all the attention from the parallel "Yo soy 132" movement. A few years later, on the day of September 11, 2016; factual evidence of Televisa's performing media manipulation emerged, when a Televisa news anchor while live-on air reading a teleprompter, mistakenly read out loud that "try that Jaime "Ël Bronco" Rodríguez Calderón (Nuevo Leon's governor) is mentioned as little as possible". Newspaper El Universal caught it on video and published it social media. Televisa didn't mention the story and declined to comment. Lack of news coverage concerning Nuevo León's Governor Jaime Rodriguez, is perceived due to him being the first elected governor to not be part of any political party (Independent Governor), and because unlike the governors from the PRI preceding him, the independent governor "El Bronco" doesn't spend money on publicity at all, preferring to communicate all news by using social media such as Twitter and Facebook. While the incident may have proven Televisa's bias, there wasn't anything to incriminate the PRI political party or Enrique Peña Nieto, though it did further suspicion of Televisa manipulating news media. In contrast, a December 2017 article of The New York Times, reported Enrique Peña Nieto spending about 2000 million dollars on publicity, during his first 5 years as president, the largest publicity budget ever spent by a Mexican President. Additionally, 68 percent of news journalists admitted to not believe to have enough freedom of speech, and award-winning news reporter Carmen Aristegui was controversially fired shortly after revealing the Mexican White House scandals. == Violence and spying towards news journalists and civil rights activists == Far for only being receiving accusations of spreading fake news, the Mexican government of EPN (Enrique Peña Nieto) has also been accused of violence towards news journalists, and of spying on them, and also towards civil right leaders and their families. During his tenure as president, Peña Nieto has been accused of failing to protect news journalists, whose deaths are speculated to be politically triggered, by politicians attempting to prevent them from covering political scandals. The New York Times published a news report on the matter titled, "In Mexico it's easy to kill a journalist", on it mentioning how during EPN's government, Mexico became one of the worst countries on which to be a journalist. The assassination of journalist Javier Valdez on May 23, 2017, received national coverage, with multiple news journalists

    Read more →
  • Data hub

    Data hub

    A data hub is a center of data exchange that is supported by data science, data engineering, and data warehouse technologies to interact with endpoints such as applications and algorithms. == Features == A data hub differs from a data warehouse in that it is generally unintegrated and often at different grains. It differs from an operational data store because a data hub does not need to be limited to operational data. A data hub differs from a data lake by homogenizing data and possibly serving data in multiple desired formats, rather than simply storing it in one place, and by adding other value to the data such as de-duplication, quality, security, and a standardized set of query services. A data lake tends to store data in one place for availability, and allow/require the consumer to process or add value to the data. Data hubs are ideally the "go-to" place for data within an enterprise, so that many point-to-point connections between callers and data suppliers do not need to be made, and so that the data hub organization can negotiate deliverables and schedules with various data enclave teams, rather than being an organizational free-for-all as different teams try to get new services and features from many other teams.

    Read more →
  • Nuance Communications

    Nuance Communications

    Nuance Communications, Inc. was an American multinational computer software technology corporation, headquartered in Burlington, Massachusetts, that markets speech recognition and artificial intelligence software. Nuance merged with its competitor in the commercial large-scale speech application business, ScanSoft, in October 2005. ScanSoft was a Xerox spin-off that was bought in 1999 by Visioneer, a hardware and software scanner company, which adopted ScanSoft as the new merged company name. The original ScanSoft had its roots in Kurzweil Computer Products. In April 2021, Microsoft announced it would buy Nuance Communications. The deal is an all-cash transaction of $19.7 billion, including company debt, or $56 per share. The acquisition was completed in March 2022. == History == The Speech Technology and Research (STAR) Laboratory at SRI International began the journey that, in 1994, resulted in a spin-off company; Corona Corporation (later renamed to Nuance Communications ). Nuance Communications (NUAN) went public on the Nasdaq Stock Market in 1995. Nuance focused on commercializing advanced speech recognition technologies. Nuance was an early spinoff of SRI's Speech Technology and Research (STAR) Laboratory, a world leader in audio processing, speech and speaker analytics and spoken language research. The technology that served as the foundation of Nuance's speech recognition solution started at the STAR Lab and helped launch Nuance more than 20 years ago. In 1995, The SRI Language Modeling Toolkit (SRILM) was developed. This provides the tools to build and apply statistical language models (LMs), primarily for use in speech recognition, statistical tagging and segmentation, and machine translation. In terms of commercialization of natural automated speech recognition, SRI's natural language speech recognition software was the first to be deployed by a major corporation. In 1996, Charles Schwab & Co., Inc., used Nuance's speech recognition technology to allow customers to receive stock quotes over the telephone. One of the key features of the ‘Schwab Discount Brokerage system’, was the ability to recognize English words even when spoken by customers with accents. In 1997, Nuance Communications developed the first large scale commercial dialog system for United Parcel Services (UPS). UPS used the voice recognition platform to handle very large numbers of inquiries about package status. The company that would later merge with Nuance Communications started life as Visioneer, incorporated in 1992. In 1999, Visioneer acquired ScanSoft, Inc. (SSFT), and the combined company became known as ScanSoft. In September 2005, ScanSoft Inc. acquired and merged with Nuance Communications (NUAN), a natural language DOD-project spinoff from SRI International. The resulting company adopted the Nuance name. During the prior decade, the two companies competed in the commercial large-scale speech application business. === Data breach === Between 2014 and 2017, Nuance exposed over 45,000 patient records. == Solutions == Customer service virtual assistants Speech recognition — for people Speech recognition — for business Speech recognition — for physicians Accessibility Power PDF Managed Print Services Transcription === ScanSoft origins === In 1974, Raymond Kurzweil founded Kurzweil Computer Products, Inc. to develop the first omni-font optical character-recognition system – a computer program capable of recognizing text written in any normal font. In 1980, Kurzweil sold his company to Xerox. The company became known as Xerox Imaging Systems (XIS), and later ScanSoft. In March 1992, a new company called Visioneer, Inc. was founded to develop scanner hardware and software products, such as a sheetfed scanner called PaperMax and the document management software PaperPort. Visioneer eventually sold its hardware division to Primax Electronics, Ltd. in January 1999. Two months later, in March, Visioneer acquired ScanSoft from Xerox to form a new public company with ScanSoft as the new company-wide name. Prior to 2001, ScanSoft focused primarily on desktop imaging software such as TextBridge, PaperPort and OmniPage. Beginning with the December 2001 acquisition of Lernout & Hauspie assets, the company moved into the speech recognition business and began to compete with Nuance. Lernout & Hauspie had acquired speech recognition company Dragon Systems in June 2001, shortly before becoming bankrupt in October. Scansoft acquired speech recognition company SpeechWorks in 2003. === Partnership with Siri and Apple Inc. === In 2013, Nuance confirmed that its natural language processing algorithms supported Apple's Siri voice assistant. === Focus on health care === In 2019, Nuance spun off its automotive division as the company Cerence, allowing it to focus on health care applications. === Acquisition by Microsoft === On April 12, 2021, Microsoft announced that it would buy Nuance Communications for $19.7 billion, or $56 a share, a 22% increase over the previous closing price. Nuance's CEO, Mark Benjamin, stayed with the company. This was Microsoft's second-biggest acquisition up to that point, after its purchase of LinkedIn for $24 billion (~$30.7 billion in 2024) in 2016. Shortly after the deal, the Competition and Markets Authority, a UK regulatory body, stated it was looking into the deal on the basis of antitrust concerns. In December 2021, it was reported that the deal would be approved by the European Union. The acquisition was completed on March 4, 2022. In May 2023, Nuance announced an unspecified number of layoffs.

    Read more →
  • Talim (textiles)

    Talim (textiles)

    Talim (Kashmiri: تعليم, Kashmiri pronunciation: [t̪əːliːm], Urdu: تَعْلِیم, Arabic: تعليم, pronounced [taʕ.liːm] ) in textiles is a symbolic code and system of notation that facilitates the creation of intricate patterns in fabrics, such as shawls and carpets, and the written coded plans that include colour schemes and weaving instructions. The term is used in traditional hand-weaving in the Indian subcontinent. Talim was initially used to create certain types of patterns in Kashmiri shawls, and later came to be applied in the production of carpets. == Etymology and origin == The term talim, which refers to a symbolic code and system of notation used by shawl and carpet artisans in their weaving processes, came to the Urdu language from the Arabic noun taʻlim (تعليم), which means "authoritative instruction", "teaching", or "edification". It means the same in Urdu and Kashmiri. The Arabic noun originated from the second form of the Arabic root verb ʻalima (علم), which means "to know". According to a local belief in Kashmir, talim was introduced to them by Persian scholar and Sufi Muslim saint Mir Sayyid Ali Hamadani. The belief notwithstanding, talim might have originated from Kashmir; Amritsar was the only place outside of Srinagar where talim was used, by migrated Kashmiri artisans. == Technique == Whereas carpets are generally woven horizontally, providing weavers with a clear view of the progress they are making in creating designs, in Kashmir, carpets are woven vertically, so the weaver is reliant on the talim. The talim technique forms fabrics by passing the weft thread as per a given script that has design codes. Weavers use talim to weave the desired pattern with planned colours. Talim involves teamwork when applying the technique, as the process of creating intricate fabric designs in weaving begins with the Naqash (designer, who designs using pencils on graphs) meticulously crafting the design on a blank sheet of paper called a naska, and the master, Talim guru, making the colour codes and symbols for weft yarns that would interlace the warp to construct the desired design. He writes on a long strip of paper, in specific symbols, the colour codes, and the number of knots to be woven with each colour. Taraha guru collaborates with talim guru and is known as the artisan responsible for determining the colours. Talim uthana is a process or the act of "picking the codes" from the graph. A clerk called the Talim Navis would record the step-by-step instructions for these numbers and colours, and thousands of low-paid and interchangeable weavers would read or recite the record to carry out the design. Afterward, a talim copyist makes copies, which are needed when multiple looms weave the same product. The script, which has been encoded, is deciphered and translated according to the specific guidelines of weavers in order to incorporate the design that is included within it. Talim has been compared to "hieroglyphics" or as a "notational-cum-cryptographic system", as it is challenging to decipher and is unique to the shawls of Kashmir, which requires expertise to comprehend. According to researcher Gagan Deep Kaur, "The talim is widely held to be a trade secret of the community and has always been fiercely guarded by the owners." Those who use talim for shawl-making do not assign important tasks to women, because of the fear that the technique and knowledge may be divulged to other communities when the women are sent there to be married. The coded cards known as talim in the Kashmiri language were used for creating certain types of patterns in shawl weaving. The talim technique is employed in the creation of kani shawls, which originated from the Kanihama region of the Kashmir valley. Carpet weaving adapted the technique from shawl making. When Kashmiri artisans started to create carpets, they chose to continue using the talim rather than switching to a different method. The resurgence of the carpet industry in Amritsar during the last century resulted in the prevalent use of the talim technique among the local weavers, a majority of whom hailed from the region of Kashmir. === Recitation of codes === Talim was also communicated through recitation accompanied by a melodic chant or song. In traditional weaving practices, the use of chanting was common. The movement of the shuttles was synchronised with the song of the weaver, adding a musical rhythm to the instructions represented through hieroglyphics. The weaver's chant, "Two blue, one red, three yellow, two blue," served as a guide as they wove and replicated the designated pattern. == Usage == The first factories established in Amritsar around 1860 utilised Bokhara designs. However, Kashmiri weavers maintained their traditional techniques and employed the talim, instead of a cartoon, for tying knots. As a result, Amritsar became the second location in the Indian subcontinent to use the talim. The traditional weaving practices are still carried out in some parts of the Indian subcontinent. The exact date when talim was last used in the subcontinent varies depending on the region and the specific weaving community. Indian textile historian Jasleen Dhamija wrote in her 1989 book Handwoven Fabrics of India that there were still some weavers in the Kashmiri village of Kanihama who applied talim in weaving shawls. As of 2022, the carpet weavers in Kashmir were the only remaining users of talim in carpets, according to Zubair Ahmed, director of the Indian Institute of Carpet Technology. The institute aims to preserve traditional Kashmiri carpet designs by digitising talim and training weavers in the technique. == Gallery ==

    Read more →
  • Control-flow diagram

    Control-flow diagram

    A control-flow diagram (CFD) is a diagram to describe the control flow of a business process, process or review. Control-flow diagrams were developed in the 1950s, and are widely used in multiple engineering disciplines. They are one of the classic business process modeling methodologies, along with flow charts, drakon-charts, data flow diagrams, functional flow block diagram, Gantt charts, PERT diagrams, and IDEF. == Overview == A control-flow diagram can consist of a subdivision to show sequential steps, with if-then-else conditions, repetition, and/or case conditions. Suitably annotated geometrical figures are used to represent operations, data, or equipment, and arrows are used to indicate the sequential flow from one to another. There are several types of control-flow diagrams, for example: Change-control-flow diagram, used in project management Configuration-decision control-flow diagram, used in configuration management Process-control-flow diagram, used in process management Quality-control-flow diagram, used in quality control. In software and systems development, control-flow diagrams can be used in control-flow analysis, data-flow analysis, algorithm analysis, and simulation. Control and data are most applicable for real time and data-driven systems. These flow analyses transform logic and data requirements text into graphic flows which are easier to analyze than the text. PERT, state transition, and transaction diagrams are examples of control-flow diagrams. == Types of control-flow diagrams == === Process-control-flow diagram === A flow diagram can be developed for the process [control system] for each critical activity. Process control is normally a closed cycle in which a sensor. The application determines if the sensor information is within the predetermined (or calculated) data parameters and constraints. The results of this comparison, which controls the critical component. This [feedback] may control the component electronically or may indicate the need for a manual action. This closed-cycle process has many checks and balances to ensure that it stays safe. It may be fully computer controlled and automated, or it may be a hybrid in which only the sensor is automated and the action requires manual intervention. Further, some process control systems may use prior generations of hardware and software, while others are state of the art. === Performance-seeking control-flow diagram === The figure presents an example of a performance-seeking control-flow diagram of the algorithm. The control law consists of estimation, modeling, and optimization processes. In the Kalman filter estimator, the inputs, outputs, and residuals were recorded. At the compact propulsion-system-modeling stage, all the estimated inlet and engine parameters were recorded. In addition to temperatures, pressures, and control positions, such estimated parameters as stall margins, thrust, and drag components were recorded. In the optimization phase, the operating-condition constraints, optimal solution, and linear-programming health-status condition codes were recorded. Finally, the actual commands that were sent to the engine through the DEEC were recorded.

    Read more →