Media preservation

Preservation of documents, pictures, recordings, digital content, etc., is a major aspect of archival science. It is also an important consideration for people who are creating time capsules, family histories, historical documents, scrapbooks and family trees. Common storage media are not permanent, and there are few reliable methods of preserving documents and pictures for the future. == Paper/prints (photos) == Color negatives and ordinary color prints may fade away to nothing in a relatively short period if not stored and handled properly. This happens even if the negatives and prints are kept in the dark, because the determining factors are heat and humidity rather than ambient light. The color degradation stems from the dyes used in the color processes. Because color processing results in a less stable image than traditional black-and-white processing, black-and-white pictures from the 1920s are more likely to survive long-term than color films and photographs from after the mid-20th century. Black-and-white photographic films using silver halide emulsions are the only film types that have proven to last in archival storage. The determining factors for longevity include the film base type, proper processing (develop, stop, fix and wash) and proper storage. Early films used a cellulose nitrate base, which was prone to decomposition and highly flammable. Nitrate film was replaced with acetate-base films. These cellulose acetate films were later discovered to outgas acids (also referred to as vinegar syndrome). Acetate films were replaced in the early 1980s by polyester film base materials, which have been determined to be more stable than film stocks with a nitrate or acetate base. Color prints made on most inkjet printers look very good at first but have a very short lifespan, measured in months rather than years.
Even prints from commercial photo labs will start to fade in a matter of years if not processed properly and stored in cool, dry environments. == Documents/books == Where a document's content matters more than the medium carrying it, the information can be copied using photocopiers and image scanners. Books and manuscripts can also have their information saved without destruction by using a book scanner. Where the medium itself needs to be preserved, for example if a document is a crayon sketch on paper by a famous artist, a complex process of preservation may be used. Depending on the condition and importance of the item, this can include gluing the media onto more stable media or enclosing the media protectively. Polyester sleeves, acid-free folders, and pH-buffered document boxes are common supportive protective enclosures whose selection must match the media's chemical and physical properties. Other considerations in preserving paper and books are light, atmosphere, humidity and temperature. Damaging light, particularly UV light, fades and destroys media over time by breaking down the molecules. The atmosphere contains small traces of sulfur dioxide and nitric acid, which turn media yellow and break the fibers down. Humidity and moisture also aid in the breakdown of media: if there is too much, the document can be attacked by bacteria, and if too little, cellulose material breaks down. Elevated temperatures can destroy some media, while low temperatures can cause water to form crystals that expand and destroy the structure of paper-based documents. == Online photo albums == Although there are many websites that allow the upload of photographs and videos, long-term digital preservation is still considered an issue. There is a lack of confidence that such websites are capable of storing data for long periods of time (e.g., 50 years) without data degradation or loss.
== Optical media - CD, DVD, Blu-ray, M-Disc == Write-once optical media, such as CD-Rs and DVD-Rs, typically contain an organic dye that distinguishes data reading from data writing based on the dye's transparency along the disc. Conventional CDs and DVDs have a finite shelf life due to natural degradation of the dye. The newer M-DISC uses inorganic material technology to produce molded DVDs and Blu-ray discs (up to 3-layer 100 GB BDXL) with a claimed lifespan of 100 to 1,000 years if stored correctly; most BD- and BDXL-rated drives made after 2011 enable the higher power mode needed for the M-DISC format. The National Archives and Records Administration lists published life expectancies of 10 or 25 years or more for normal CDs and DVDs, and conservative life expectancies of between 2 and 5 years. Storage conditions, such as temperature and humidity, as well as handling conditions, such as frequency of media use and compatibility between the recorder and media, affect media shelf life. Improvements in media storage and migrations to new recording technologies can make certain formats obsolete within their respective lifespans. Technologists have pointed to internet streaming services, where services such as video-on-demand have contributed to the 33 percent decline in DVD sales over the past 5 years, as a challenge for digital preservation. == Magnetic media - video cassettes, tapes, hard drives == Magnetic media such as audio and video tape and floppy disks also have limited life spans. Audio and video tapes require specific care and handling to ensure that the recorded information will be preserved. For information that must be preserved indefinitely, periodic transcription from old media to new ones is necessary, not only because the media are unstable but also because the recording technology may become obsolete. Magnetic media also deteriorate naturally, with typical shelf lives between 10 and 20 years.
Magnetic tape can degrade from binder hydrolysis or magnetic remanence decay. Binder hydrolysis, also known as sticky-shed syndrome, refers to the breakdown of the binder, or glue, that holds the magnetic particles to the polyester base of the tape. Tapes which have been stored in hot, humid conditions are particularly vulnerable to this phenomenon and may suffer from accelerated degradation. Severe binder hydrolysis can cause the magnetic material to fall off or shed from the base, leaving a pile of dust and clear backing. Archivists can bake the tape, which evaporates water molecules on the tape, to temporarily restore the binder before making a copy. Magnetic tape can also be destabilized by magnetic remanence decay, which refers to the weakening of the tape's magnetization over time. This weakens the affected tape's readability, leading to reduced sound clarity and volume or picture hue and contrast. Baking the tape will not restore magnetization. Media at risk include recorded media such as master audio recordings of symphonies and videotape recordings of the news gathered over the last 40 years. Threats to media that must be considered when archiving important record media include accidental erasure, physical loss due to disasters such as fires and floods, and media degradation. Along with the actual media being degraded over the years, the machines that are available to play back or reproduce the audio sources are becoming archaic themselves. Manufacturers and their support (parts, technical updates) for their machines have disappeared over the years. Even if the medium is vaulted and archived correctly, the mechanical properties of the machines may have deteriorated to the point that they could do more harm than good to the tape being played. Many major film studios are now backing up their libraries by converting them to electronic media files, such as .AIFF or .WAV-based files, via digital audio workstations.
That way, even if the digital platform manufacturer goes out of business or no longer supports its product, the files can still be played on any common computer. Even with a digital solution in place, a detailed process must take place prior to the final archival product. Sample rates, their conversion, and reference speed are all critical in this process. In floppy disks, the lubricants inside the plastic jackets of many older floppies promote the decay of the magnetic medium. Also, the alignment of the magnetic particles of the disk substrate may gradually degrade, leading to a loss of formatting and data. Early laser disk media were prone to degradation because the layers of the disk substrate were bonded with an adhesive that was vulnerable to decay and would crumble over time. This would lead the different layers of the disk to peel apart, damaging the pitted data surface and rendering the disk unreadable.

Anthem medical data breach

The Anthem medical data breach was a medical data breach of information held by Elevance Health, known at that time as Anthem, Inc. On February 4, 2015, Anthem, Inc. disclosed that criminal hackers had broken into its servers and had potentially stolen over 37.5 million records containing personally identifiable information. On February 24, 2015, Anthem raised the number to 78.8 million people whose personal information had been affected. According to Anthem, Inc., the data breach extended into multiple brands that Anthem, Inc. uses to market its healthcare plans, including Anthem Blue Cross, Anthem Blue Cross and Blue Shield, Blue Cross and Blue Shield of Georgia, Empire Blue Cross and Blue Shield, Amerigroup, Caremore, and UniCare. Healthlink says that it was also a victim. Anthem says users' medical information and financial data were not compromised. Anthem has offered free credit monitoring in the wake of the breach. Michael Daniel, chief adviser on cybersecurity for President Barack Obama, said he would be changing his own password. According to The New York Times, about 80 million company records were hacked, and there is a fear that the stolen data will be used for identity theft. The compromised information contained names, birthdays, medical IDs, social security numbers, street addresses, e-mail addresses and employment information, including income data. == Theft of the data == The data was stolen over a period of weeks the month before the data breach was discovered. Because no medical information was compromised, Anthem was not required by law to encrypt the data. However, Anthem faced several civil class-action lawsuits, which were settled in 2017 at a cost of $115 million. Anthem did not admit any wrongdoing in the settlement. Data from the attack is expected to be sold on the black market. == Impact == People whose data was stolen could face identity theft problems for the rest of their lives.
Anthem had a US$100 million insurance policy for cyber problems from American International Group. One report suggested that all of this money could be consumed by the process of notifying customers of the breach. == Responses == Anthem hired Mandiant, a cybersecurity firm, to review their security systems and advised people whose data was stolen to monitor their accounts and remain vigilant. The theft of the data raised fears generally about the theft of medical information. A writer from Harvard Law School suggested that this data breach might spark reform of security practices and government data safety regulation. An investigation conducted by several state insurance commissioners blames the breach on an attacker whose identity was withheld, and claims that the breach was likely ordered by a foreign government whose name was withheld. It also concluded that Anthem had taken reasonable measures to protect its data before the breach and that its remediation plan was effective at shutting down the breach once it was discovered. It also marks the starting date of the breach as February 18, 2014. The lead investigator was the Indiana Department of Insurance (DOI) -- Anthem's principal regulator, because Anthem is headquartered in Indiana. The Indiana DOI hired independent auditors to conduct a security assessment at Anthem, which concluded, "While deficiencies within Anthem’s cybersecurity posture were noted by the Examination Team, these deficiencies were not, in our experience, uncommon to companies comparable to Anthem in size and scope. While the pre-breach deficiencies impacted Anthem’s ability to reduce the likelihood of and quickly detect the Data Breach, the controls implemented subsequent to the Data Breach should improve Anthem’s ability to detect future breaches and enable Anthem to respond more effectively to a future attack than was the case in this instance." 
Federal regulators also conducted an investigation of the Anthem data breach, resulting in a $16 million settlement between Anthem and the Department of Health and Human Services (HHS), by far the largest HHS data breach settlement. An HHS Director overseeing the investigation said, "The largest health data breach in U.S. history fully merits the largest HIPAA settlement in history. Unfortunately, Anthem failed to implement appropriate measures for detecting hackers who had gained access to their system to harvest passwords and steal people's private information." The HHS settlement also required Anthem to perform a risk assessment and correct any identified deficiencies in its cybersecurity, with HHS oversight of Anthem's progress. Approximately 100 private class action lawsuits were filed against Anthem over the data breach and consolidated in California federal court before Judge Koh, a respected authority in data breach litigation. After contested briefing over who should lead the litigation efforts, Judge Koh appointed Eve Cervantez of Altshuler Berzon and Andy Friedman of Cohen Milstein as co-lead counsel, and appointed Eric Gibbs of Gibbs Law Group and Michael Sobel of Lieff Cabraser to head a Plaintiffs' Steering Committee. In 2017, Anthem agreed to settle the litigation for $115 million, the largest data breach settlement ever at the time. The attorneys requested $38 million in fees for their work on the case, but Judge Koh reduced the fee request, finding that only $31 million in fees was merited.

Void Trilogy

The Void Trilogy is a space opera series by British author Peter F. Hamilton. The series is set in the same universe as The Commonwealth Saga, 1,200 years after the end of Judas Unchained. Peter F. Hamilton sold the American rights to the series to Random House. The series includes the following books: The Dreaming Void (2007) The Temporal Void (2008) The Evolutionary Void (2010) == Synopsis == === The Dreaming Void === What was formerly believed to be a supermassive black hole at the centre of the Milky Way is revealed to be an artificial construct, known as the Void. Inside, there is a strange universe where the laws of physics are very different from standard physics. It is slowly consuming the other stars of the galactic core—one day it will have devoured the entire galaxy. In AD 3320, a human member of the Commonwealth, Inigo, begins to have dreams of the wonderful existence inside the Void. His dreams inspire the disaffected, who desire to travel into the Void, where their every wish will be fulfilled. By AD 3456, the pseudo-religious Living Dream movement exceeds 5 billion members, organizing the followers into a powerful political force. Other star-faring species fear their migration will cause the Void to expand again thus devouring the galaxy. They are prepared to stop the pilgrimage fleet no matter what the cost. The Dreaming Void is broken into two distinct sections. The first follows Edeard, a young boy who lives inside the Void on a planet called Querencia, the subject of Inigo's dreams. Edeard, an orphan and apprentice, lives in Ashwell, a town in Rulan province. A gifted psychic, he is trained by Master Akeem in crafting and modding. Initially a loner, he comes to prominence in his village after designing an alternative pump mechanism for the local well. Unfortunately his luck changes for the worse after Ashwell is raided by bandits. Forced to flee, he joins the local caravan and travels to Makkathran, the capital of Querencia. 
In Makkathran, Edeard joins the constables and after a brutal couple of months in training, he graduates and is promoted to the commander of his Squad. He makes little progress battling the rigid and backward judicial system of Makkathran; his first real break is when his squad overcomes a trap set by the local gang, and Edeard walks on water chasing the leader of the gang. A testament to his growing psychic abilities, Edeard's stunt earns him the title of Waterwalker, and he becomes an instant star in Makkathran. The second section of The Dreaming Void is set back in the Commonwealth. Inigo, the first dreamer, and founder of Living Dream, has disappeared, leaving the 5 billion strong Living Dream movement in a state of flux. When Ethan, succeeding Inigo as the head of the movement, proclaims that the Living Dream will embark on a pilgrimage into the Void, the Commonwealth is thrown into a state of political chaos. Fearing that the human migration might cause the Void to expand (and in the process destroy whole systems or even the whole Galaxy) other spacefaring races such as the Raiel and Ocisen Empire are deeply concerned, with the latter threatening military action. This has left the Commonwealth government deeply divided, with the two largest factions in disagreement, the Accelerators faction/party supporting the pilgrimage and the Conservative faction opposing. As both parties are unable to solve the situation politically they have resolved to take matters into their own hands, with each party sending agents to further its interests. Aaron, a sleeper cell agent, is tasked with finding Inigo. He kidnaps and manipulates Corrie-Lyn, a former lover of Inigo and interrogates her for information. He also travels to Kuhmo (Inigo's homeworld) to get further information and robs Inigo's secure storage (a bank for memory). He eventually tracks Inigo to Hanko, a desolate and barren world. 
However, before Aaron can extract Inigo, Accelerator agents destroy Aaron's starship, leaving him marooned on Hanko. Meanwhile, Accelerator agents make a deal with Ethan, agreeing to give the Living Dream movement Ultra Drives to power their ships. Accelerator plans are halted when the Delivery Man, a Conservative party agent, destroys valuable FTL Drive tech. Troblum, an Accelerator physicist, also defects, further slowing the Accelerators' plans. === The Temporal Void === The Temporal Void picks up after The Dreaming Void. The Intersolar Commonwealth faces mounting turmoil as the deadline for Living Dream's Pilgrimage into the Void approaches. An Ocisen Empire fleet advances on a mission of genocide, while an internecine war erupts among post-human factions over humanity's future. Amidst the chaos, investigator Paula Myo struggles to counter the increasingly desperate actions of various agents and factions. Relentless in her pursuit, she contends with adversaries from her distant past and colleagues of uncertain loyalty, all while racing against time. At the center of the unfolding crisis is Edeard the Waterwalker, a figure from the distant past who lived deep within the Void. As the messiah of Living Dream, his life, broadcast through visions, captivates and inspires billions. His story fuels the Pilgrimage's momentum, a force seemingly impossible to stop. As Edeard approaches his ultimate victory, the true nature of the Void is finally revealed. === The Evolutionary Void === The Evolutionary Void picks up after The Temporal Void. Exposed as the Second Dreamer, Araminta has become the target of a galaxy-wide search by government agent Paula Myo and the psychopath known as the Cat, along with others equally determined to prevent, or facilitate, the pilgrimage of the Living Dream cult into the heart of the Void. An indestructible microuniverse, the Void may contain paradise, as the cultists believe, but it is also a deadly threat.
For the miraculous reality that exists inside its boundaries demands energy, energy drawn from everything outside those boundaries: from planets, stars, galaxies, and everything that lives, for the Pilgrimage will trigger a super-massive expansion of the Void. Meanwhile, the parallel story of Edeard, the Waterwalker, as told through a series of dreams communicated to the gaiafield via Inigo, the First Dreamer, continues to unfold. But the inspirational tale of this idealistic young man takes a darker and more troubling turn as he finds himself faced with powerful new enemies, and temptations more powerful still, to reach fulfilment in the end. Named a Silfen Friend like her ancestress Mellanie, Araminta chooses to face her unwanted responsibilities, with no guarantee of success or survival. She takes on the role of Second Dreamer to lead the first wave of Living Dream, 24 million people, into the Void, leaving everyone confused and lost by her actions. However, in actuality, she is playing a double game. Using her original body to lead the Living Dream as a diversion, she borrows one of her fiancé's (Mr. Bovey) bodies to set out to destroy the Void. She is able to connect with a Skylord and travel the Silfen Paths. With time running out, a repentant Inigo decides to release Edeard's final dream whose message is scarcely less dangerous than the pilgrimage promises to be, where perfection is achieved, so that nothing else is left to strive for and the human race in the Void has started to devolve. He goes to the Spike to meet Ozzie and stays there to meet with Araminta, who is using one of her fiancé's bodies, and Oscar. Third Dreamer Gore Burnelli has a plan to reason with the Heart, the core of the Void. He secures the help of the Delivery Man and travels to the Anomine homeworld to retrieve the mechanism that allowed them to go post-physical. He is able to connect with Justine, his daughter, who is currently in the Void, by way of Dreams. 
The monomaniacal Ilanthe, leader of the breakaway Accelerator Faction, seeks dominion in the Void. It is not Fusion with the Void to attain post-physical status that she wants, but to have control over everything. Using Dark Fortress technology, she sets up a barrier around the Sol system which leaves ANA and the deterrence fleet trapped inside. It is this technology which she has equipped the ships travelling to the Void with, the ability to create a forcefield which the Warrior Raiel cannot penetrate. == Technology == The Commonwealth uses a number of advanced technologies. In the early days of the Commonwealth, humans used static and permanently opened wormholes to travel from planet to planet. However, after the events of the Starflyer War (detailed in the Commonwealth Saga), the CST corporation's monopoly on space travel was ended. With the advent of wormholes that could wrap around ships, the Commonwealth saw a shift from wormholes to spaceships. Another development in the Commonwealth is the gaiafield. Developed by Ozzie Issac in AD 3000, the gaiafield is based on Silfen technology; when Ozzie was named a friend of the Silfen during the Starflye

Thompson sampling

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that address the exploration–exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief. == Description == Consider a set of contexts $\mathcal{X}$, a set of actions $\mathcal{A}$, and rewards in $\mathbb{R}$. The aim of the player is to play actions under the various contexts so as to maximize the cumulative rewards. Specifically, in each round, the player obtains a context $x \in \mathcal{X}$, plays an action $a \in \mathcal{A}$ and receives a reward $r \in \mathbb{R}$ following a distribution that depends on the context and the issued action. The elements of Thompson sampling are as follows: a likelihood function $P(r|\theta,a,x)$; a set $\Theta$ of parameters $\theta$ of the distribution of $r$; a prior distribution $P(\theta)$ on these parameters; past observation triplets $\mathcal{D}=\{(x;a;r)\}$; a posterior distribution $P(\theta|\mathcal{D}) \propto P(\mathcal{D}|\theta)P(\theta)$, where $P(\mathcal{D}|\theta)$ is the likelihood function.
Thompson sampling consists of playing the action $a^{\ast} \in \mathcal{A}$ according to the probability that it maximizes the expected reward; action $a^{\ast}$ is chosen with probability $\int \mathbb{I}\left[\mathbb{E}(r|a^{\ast},x,\theta)=\max_{a'}\mathbb{E}(r|a',x,\theta)\right]P(\theta|\mathcal{D})\,d\theta$, where $\mathbb{I}$ is the indicator function. In practice, the rule is implemented by sampling. In each round, parameters $\theta^{\ast}$ are sampled from the posterior $P(\theta|\mathcal{D})$, and an action $a^{\ast}$ is chosen that maximizes $\mathbb{E}[r|\theta^{\ast},a^{\ast},x]$, i.e. the expected reward given the sampled parameters, the action, and the current context. Conceptually, this means that the player instantiates their beliefs randomly in each round according to the posterior distribution, and then acts optimally according to them. In most practical applications, it is computationally onerous to maintain and sample from a posterior distribution over models. As such, Thompson sampling is often used in conjunction with approximate sampling techniques. == History == Thompson sampling was originally described by Thompson in 1933. It was subsequently rediscovered numerous times independently in the context of multi-armed bandit problems. A first proof of convergence for the bandit case was shown in 1997. The first application to Markov decision processes was in 2000. A related approach (see Bayesian control rule) was published in 2010. In 2010 it was also shown that Thompson sampling is instantaneously self-correcting. Asymptotic convergence results for contextual bandits were published in 2011.
Thompson sampling has been widely used in many online learning problems, including A/B testing in website design and online advertising, and accelerated learning in decentralized decision making. A Double Thompson Sampling (D-TS) algorithm has been proposed for dueling bandits, a variant of the traditional multi-armed bandit (MAB) problem in which feedback comes in the form of pairwise comparisons. == Relationship to other approaches == === Probability matching === Probability matching is a decision strategy in which predictions of class membership are proportional to the class base rates. Thus, if in the training set positive examples are observed 60% of the time, and negative examples are observed 40% of the time, the observer using a probability-matching strategy will predict (for unlabeled examples) a class label of "positive" on 60% of instances, and a class label of "negative" on 40% of instances. === Bayesian control rule === A generalization of Thompson sampling to arbitrary dynamical environments and causal structures, known as the Bayesian control rule, has been shown to be the optimal solution to the adaptive coding problem with actions and observations. In this formulation, an agent is conceptualized as a mixture over a set of behaviours. As the agent interacts with its environment, it learns the causal properties and adopts the behaviour that minimizes the relative entropy to the behaviour with the best prediction of the environment's behaviour. If these behaviours have been chosen according to the maximum expected utility principle, then the asymptotic behaviour of the Bayesian control rule matches the asymptotic behaviour of the perfectly rational agent. The setup is as follows. Let $a_1, a_2, \ldots, a_T$ be the actions issued by an agent up to time $T$, and let $o_1, o_2, \ldots, o_T$ be the observations gathered by the agent up to time $T$.
Then, the agent issues the action $a_{T+1}$ with probability $P(a_{T+1}|\hat{a}_{1:T},o_{1:T})$, where the "hat" notation $\hat{a}_t$ denotes the fact that $a_t$ is a causal intervention (see Causality), and not an ordinary observation. If the agent holds beliefs $\theta \in \Theta$ over its behaviours, then the Bayesian control rule becomes $P(a_{T+1}|\hat{a}_{1:T},o_{1:T})=\int_{\Theta}P(a_{T+1}|\theta,\hat{a}_{1:T},o_{1:T})\,P(\theta|\hat{a}_{1:T},o_{1:T})\,d\theta$, where $P(\theta|\hat{a}_{1:T},o_{1:T})$ is the posterior distribution over the parameter $\theta$ given actions $a_{1:T}$ and observations $o_{1:T}$. In practice, the Bayesian control rule amounts to sampling, at each time step, a parameter $\theta^{\ast}$ from the posterior distribution $P(\theta|\hat{a}_{1:T},o_{1:T})$, where the posterior distribution is computed using Bayes' rule by only considering the (causal) likelihoods of the observations $o_1, o_2, \ldots, o_T$ and ignoring the (causal) likelihoods of the actions $a_1, a_2, \ldots, a_T$, and then sampling the action $a_{T+1}^{\ast}$ from the action distribution $P(a_{T+1}|\theta^{\ast},\hat{a}_{1:T},o_{1:T})$. === Upper-confidence-bound (UCB) algorithms === Thompson sampling and upper-confidence-bound algorithms share a fundamental property that underlies many of their theoretical guarantees.
Roughly speaking, both algorithms allocate exploratory effort to actions that might be optimal and are in this sense "optimistic". Leveraging this property, one can translate regret bounds established for UCB algorithms to Bayesian regret bounds for Thompson sampling or unify regret analysis across both these algorithms and many classes of problems.
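The sampling rule from the Description section can be illustrated for the simplest case, a Bernoulli bandit without contexts. The following is a minimal sketch, assuming a uniform Beta(1, 1) prior on each arm's success probability, so that the posterior after s successes and f failures is Beta(1 + s, 1 + f). The function name `thompson_bernoulli`, the arm probabilities and the number of rounds are hypothetical choices made for illustration; they do not come from the article.

```python
import random


def thompson_bernoulli(true_probs, rounds, seed=0):
    """Run Thompson sampling on a simulated Bernoulli bandit.

    true_probs: hidden success probability of each arm (used only to
                simulate rewards, never seen by the player).
    Returns the number of times each arm was played.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k  # observed rewards of 1, per arm
    failures = [0] * k   # observed rewards of 0, per arm
    pulls = [0] * k

    for _ in range(rounds):
        # Draw theta* from the posterior: one Beta sample per arm.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(k)
        ]
        # Act optimally with respect to the sampled belief.
        arm = max(range(k), key=lambda a: samples[a])
        # Observe a simulated reward and update the posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_bernoulli([0.2, 0.5, 0.8], rounds=2000))
```

Because each arm's sample is drawn from its own posterior, arms with few observations still have a chance of producing the largest sample, which is how exploration arises without a separate exploration parameter. In settings where the posterior has no closed form, the exact Beta draw would have to be replaced by an approximate sampler.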

Rumelhart Prize

The David E. Rumelhart Prize for Contributions to the Theoretical Foundations of Human Cognition was founded in 2001 in honor of the cognitive scientist David Rumelhart to introduce the equivalent of a Nobel Prize for cognitive science. It is awarded annually to "an individual or collaborative team making a significant contemporary contribution to the theoretical foundations of human cognition". The annual award is presented at the Cognitive Science Society meeting, where the recipient gives a lecture and receives a check for $100,000. At the conclusion of the ceremony, the next year's award winner is announced. The award is funded by the Robert J. Glushko and Pamela Samuelson Foundation. The Rumelhart Prize committee is independent of the Cognitive Science Society. However, the society provides a large and interested audience for the awards. == Selection Committee == As of 2022, the selection committee for the prize consisted of: Richard Cooper (chair) Dedre Gentner Robert J. Glushko Tania Lombrozo Steven T. Piantadosi Jesse Snedeker == Recipients ==

DBGallery

DBGallery, short for Database Gallery, is a cloud-based Software as a Service (SaaS) and on-premises web server product for teams of various sizes. DBGallery enables users to centrally store, manage, catalog, archive, and securely share image, video, and document files. It facilitates version control, detects duplicates, and offers search functionality, making assets accessible to all users. It uses current AI technologies to automatically add metadata to images, supports custom-trained AI models, and offers bespoke AI features. Additionally, DBGallery provides team management tools, workflow management, an activity audit trail, and other collaborative features for both internal and external stakeholders. == History == DBGallery's first public release was in December 2007. Each year since then has seen further enhancements. 2013 added support for additional non-English languages in its metadata. 2014 added support for creating custom data fields for tagging and search. 2015 added the ability to auto-tag images using reverse geocoding. 2018 added artificial intelligence (AI) image recognition as a further addition to auto-tagging. March 2020 added complete image collection management via the web (e.g. file and folder drag and drop), a new collection dashboard, custom data layouts, and an improved audit trail. 2021 saw user experience improvements through improved styling and performance enhancements. Version 12 was released in October 2021. It added the ability to upload files of unlimited size and made significant performance improvements for very large collections. June 2022 saw the release of a global duplicate images search. In late 2022, DBGallery began offering cloud storage at a third of its previous prices, which complemented its recent high-volume, high-capacity capabilities and its clients' subsequent demand for additional storage.
In 2023 it improved user and role management, introduced its mobile app (a PWA), and improved custom-trained object detection. Release 14.0, in the spring of 2024, brought major sharing improvements and a new find-related-images feature. The v15 release in winter 2025 introduced AI-generated image descriptions, image-to-text, and facial recognition.

Ensemble averaging (machine learning)

In machine learning, ensemble averaging is the process of creating multiple models (typically artificial neural networks) and combining them to produce a desired output, as opposed to creating just one model. Ensembles of models often outperform individual models, as the various errors of the ensemble constituents "average out".

== Overview ==

Ensemble averaging is one of the simplest types of committee machines. Along with boosting, it is one of the two major types of static committee machines. In contrast to standard neural network design, in which many networks are generated but only one is kept, ensemble averaging keeps the less satisfactory networks, but with less weight assigned to their outputs. The theory of ensemble averaging relies on two properties of artificial neural networks:

* In any network, the bias can be reduced at the cost of increased variance.
* In a group of networks, the variance can be reduced at no cost to the bias.

The first property is known as the bias–variance tradeoff. Ensemble averaging creates a group of networks, each with low bias and high variance, and combines them to form a new network which should theoretically exhibit low bias and low variance. Hence, this can be thought of as a resolution of the bias–variance tradeoff. The idea of combining experts can be traced back to Pierre-Simon Laplace.

== Method ==

The theory mentioned above gives an obvious strategy: create a set of experts with low bias and high variance, and average them. Generally, this means creating a set of experts with varying parameters; frequently, these are the initial synaptic weights of a neural network, although other factors (such as learning rate, momentum, etc.) may also be varied. Some authors recommend against varying weight decay and early stopping.
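The claim that averaging reduces variance without changing bias can be checked with a small simulation. The sketch below is illustrative only: it assumes each expert's prediction is the true value plus independent zero-mean Gaussian noise, which is an idealization of "low bias, high variance" experts. The function name `simulate` and all parameter values are hypothetical choices, not part of any library.

```python
import random
import statistics


def simulate(n_experts, n_trials=2000, true_value=1.0, noise_sd=1.0, seed=0):
    """Return (bias, variance) of the averaged prediction over many trials."""
    rng = random.Random(seed)
    averaged = []
    for _ in range(n_trials):
        # Assumption: each expert is unbiased, with independent noise.
        preds = [true_value + rng.gauss(0.0, noise_sd) for _ in range(n_experts)]
        averaged.append(sum(preds) / n_experts)
    bias = statistics.fmean(averaged) - true_value
    variance = statistics.pvariance(averaged)
    return bias, variance


if __name__ == "__main__":
    for n in (1, 5, 25):
        bias, var = simulate(n)
        print(f"N={n:2d}  bias={bias:+.3f}  variance={var:.3f}")
```

Under the independence assumption, the variance of the average is roughly `noise_sd ** 2 / n_experts`, while the bias stays near zero. Errors of networks trained on the same data are usually correlated, so the reduction seen in practice is likely to be smaller.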
In practice, the steps are:

1. Generate N experts, each with its own initial parameters (these values are usually sampled randomly from a distribution).
2. Train each expert separately.
3. Combine the experts and average their values.

Alternatively, domain knowledge may be used to generate several classes of experts. An expert from each class is trained, and the trained experts are then combined.

A more complex version of ensemble averaging views the final result not as a mere average of all the experts, but rather as a weighted sum. If each expert is $y_j$, then the overall result $\tilde{y}$ can be defined as:

$$\tilde{y}(\mathbf{x}; \boldsymbol{\alpha}) = \sum_{j=1}^{p} \alpha_j y_j(\mathbf{x})$$

where $p$ is the number of experts and $\boldsymbol{\alpha}$ is a set of weights. The optimization problem of finding $\boldsymbol{\alpha}$ is readily solved through neural networks: a "meta-network" in which each "neuron" is in fact an entire neural network can be trained, and the synaptic weights of the final network are the weights applied to each expert. This is known as a linear combination of experts.

Most forms of neural network can be seen as special cases of a linear combination: the standard neural net (where only one expert is used) is simply a linear combination with one $\alpha_k = 1$ and all other $\alpha_j = 0$. A raw average is the case where all $\alpha_j$ are equal to the same constant value, namely one over the total number of experts.

A more recent ensemble averaging method is negative correlation learning, proposed by Y. Liu and X. Yao. This method has been widely used in evolutionary computing.
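The three steps and the weighted sum described in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the experts are tiny one-hidden-layer tanh networks, the target function `sin(3x)`, the noise level, and the hyperparameters are arbitrary, and every function name (`init_expert`, `train`, `combine`, `build_ensemble`, and so on) is hypothetical. The weights `alphas` are passed in by hand rather than learned by a meta-network.

```python
import math
import random


def init_expert(rng, hidden=4):
    """Step 1: draw one expert's initial parameters at random."""
    return {
        "w1": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b1": [rng.uniform(-1, 1) for _ in range(hidden)],
        "w2": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b2": 0.0,
    }


def hidden_layer(expert, x):
    return [math.tanh(w * x + b) for w, b in zip(expert["w1"], expert["b1"])]


def predict(expert, x):
    h = hidden_layer(expert, x)
    return sum(v * hj for v, hj in zip(expert["w2"], h)) + expert["b2"]


def train(expert, data, rng, epochs=200, lr=0.05):
    """Step 2: train one expert separately with plain SGD on squared error."""
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, t in data:
            h = hidden_layer(expert, x)
            err = sum(v * hj for v, hj in zip(expert["w2"], h)) + expert["b2"] - t
            for j, hj in enumerate(h):
                # Gradient w.r.t. the hidden pre-activation, using the old w2[j].
                grad_pre = err * expert["w2"][j] * (1.0 - hj * hj)
                expert["w2"][j] -= lr * err * hj
                expert["w1"][j] -= lr * grad_pre * x
                expert["b1"][j] -= lr * grad_pre
            expert["b2"] -= lr * err
    return expert


def combine(experts, x, alphas=None):
    """Step 3: weighted sum of expert outputs; defaults to the raw average."""
    if alphas is None:
        alphas = [1.0 / len(experts)] * len(experts)
    return sum(a * predict(e, x) for a, e in zip(alphas, experts))


def mse(f, data):
    return sum((f(x) - t) ** 2 for x, t in data) / len(data)


def target(x):
    return math.sin(3.0 * x)  # arbitrary illustrative target


def build_ensemble(n_experts=5, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(40)]
    train_set = [(x, target(x) + rng.gauss(0.0, 0.1)) for x in xs]
    test_set = [(i / 25.0 - 1.0, target(i / 25.0 - 1.0)) for i in range(51)]
    experts = [train(init_expert(rng), train_set, rng) for _ in range(n_experts)]
    return experts, test_set


if __name__ == "__main__":
    experts, test_set = build_ensemble()
    for k, e in enumerate(experts):
        print(f"expert {k}: MSE = {mse(lambda x, e=e: predict(e, x), test_set):.4f}")
    print(f"raw average: MSE = {mse(lambda x: combine(experts, x), test_set):.4f}")
```

Passing `alphas=[1.0, 0.0, 0.0, 0.0, 0.0]` to `combine` reproduces the first expert alone, matching the single-network special case. Because squared error is convex, the raw average's mean squared error cannot exceed the mean of the individual experts' errors on the same data, although it may still be worse than the best single expert.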
== Benefits ==

* The resulting committee is almost always less complex than a single network that would achieve the same level of performance.
* The resulting committee can be trained more easily on smaller datasets.
* The resulting committee often has improved performance over any single model.
* The risk of overfitting is lessened, as there are fewer parameters (e.g. neural network weights) which need to be set.