AI Assistant Reddit


Be My Eyes

Be My Eyes is a Danish mobile app that aims to help blind and visually impaired people recognize objects and manage everyday situations. An online community of sighted volunteers receives photos or videos from randomly assigned blind or visually impaired users and assists via live chat. In 2023, the company launched Be My AI, an AI-based interface that describes images for blind and visually impaired users. The app is currently available for Android, iOS, and Windows. == History == === Founding and early years === The app was developed and marketed by Hans Jørgen Wiberg. He had observed that although video chat software such as Skype and FaceTime existed, none was tailored for the visually impaired. For development, he joined forces with the Danish Association of the Blind and other organizations. The app was first presented at an event for start-up companies in 2012 and first released in 2015. A version for Android was released in 2017, alongside the existing iOS version. The app was praised for its ease of use and criticized for insufficient data protection, which makes it possible to pass on data to third parties. === Recent developments === The company has raised over $650,000, including funding from Silicon Valley, Microsoft, and other angel investors. In February 2020, $2.8 million in Series A funding was raised. The investment allows the company to further develop its "purpose and profit" business model while keeping the visual support service free and unlimited for all visually impaired users. === User base and accessibility === Over 9.3 million volunteers and 900,000 blind or visually impaired people use the app. == Features == === Human-based assistance === A visually impaired person starts a live stream showing their view from their cellphone camera.
They are assigned, through a phone call or chat, a random volunteer who speaks the same language and who is in the same time zone. This allows the volunteer to describe an object and assist the visually impaired person, such as guiding the person to move their camera, read instructions, or clean up a spill. Through speech synthesis, content can be read out loud. This process encourages a more independent life for blind and visually impaired people. === Be My AI === In March of 2023, Be My Eyes launched Be My AI, an AI-based virtual assistant. Be My AI is accessible through the Be My Eyes app, and is based on OpenAI's GPT-4 large language model. Through the interface, the app allows blind and visually impaired users to send images from a variety of devices to be described. The app allows users to then follow up with questions to further tailor the image description. Blind users report using Be My AI for a variety of tasks, including reading menus, identifying clothing, and describing people. The Be My AI interface is available on Android, iOS, and Windows. Within a few weeks of the interface's roll out, the company reported that it had been used one million times, and it was named among Time's best inventions of 2023. Be My AI is part of a growing number of AI-based apps and devices designed to help blind and visually impaired individuals. == Partnerships == === Microsoft === In November 2023, Be My Eyes entered a partnership with Microsoft to share data to help improve accessibility-focused AI models. === Meta === In 2024, Be My Eyes integrated with Ray-Ban Meta smart glasses, a wearable product developed by Meta and EssilorLuxottica. The partnership enabled users to receive hands-free, real-time visual descriptions and volunteer assistance by using voice commands through the smart glasses. === Hilton === In October 2024, Hilton partnered with Be My Eyes to provide live video assistance for blind and low-vision guests. 
The free service connects travelers to a Hilton team member who can guide them through tasks like adjusting thermostats, opening window shades, or navigating hotel amenities. This collaboration grew out of a prior arrangement in which Hilton helped train Be My Eyes' GPT-4-powered AI model to better recognize objects and layouts in hotel rooms. === Tesco === In October 2025, retailer Tesco announced a partnership with Be My Eyes to launch a six-month pilot aimed at improving in-store accessibility in the UK. The initiative was launched on World Sight Day, 9 October, and enables Be My Eyes users to connect directly with Tesco staff via the app for personalised visual assistance while shopping, Euronewsweek reported. == Awards ==
Nordic Startup Awards for "Best Social Entrepreneurial Tech Startup" in Denmark
2021 Apple Design Award for best social impact
Combs method

The Combs method is a rule base reduction method of writing fuzzy logic rules described by William E. Combs in 1997. It is designed to prevent combinatorial explosion in fuzzy logic rules. The Combs method takes advantage of the logical equality $((p \land q) \Rightarrow r) \iff ((p \Rightarrow r) \lor (q \Rightarrow r))$. == Equality proof == The simplest proof of the equality uses a truth table: both sides take the same truth value in each of the eight assignments of $p$, $q$, and $r$. == Combinatorial explosion == Suppose we have a fuzzy system that considers $N$ variables at a time, each of which can fit into at least one of $S$ sets. The number of rules necessary to cover all the cases in a traditional fuzzy system is $S^N$, whereas the Combs method would need only $S \times N$ rules. For example, if we have five sets and five variables to consider to produce one output, covering all the cases would require 3125 rules in a traditional system, while the Combs method would require only 25 rules, taming the combinatorial explosion that occurs when more inputs or more sets are added to the system. This article will focus on the Combs method itself. To learn more about the way rules are traditionally formed, see fuzzy logic and fuzzy associative matrix. == Example == Suppose we were designing an artificial personality system that determines how friendly the personality is supposed to be towards a person in a strategic video game. The personality would consider its own fear of, trust in, and love for the other person.
A set of rules in the Combs system might translate to:
[IF Fear IS Unafraid THEN Friendship IS Enemies OR IF Fear IS ModerateFear THEN Friendship IS Neutral OR IF Fear IS Afraid THEN Friendship IS GoodFriends] OR
[IF Trust IS Distrusting THEN Friendship IS Enemies OR IF Trust IS ModerateTrust THEN Friendship IS Neutral OR IF Trust IS Trusting THEN Friendship IS GoodFriends] OR
[IF Love IS Unloving THEN Friendship IS Enemies OR IF Love IS ModerateLove THEN Friendship IS Neutral OR IF Love IS Loving THEN Friendship IS GoodFriends]
In this case, because the rules follow a straightforward pattern in the output, they could be rewritten as a table:
Fear: Unafraid | ModerateFear | Afraid
Trust: Distrusting | ModerateTrust | Trusting
Love: Unloving | ModerateLove | Loving
Friendship: Enemies | Neutral | GoodFriends
Each column of the table maps to the output provided in the last row. To obtain the output of the system, we just average the outputs of each rule for that output. For example, to calculate how much the computer is Enemies with the player, we take the average of how much the computer is Unafraid, Distrusting, and Unloving of the player. When all three averages are obtained, the result can then be defuzzified by any of the traditional means.
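The ideas in this section can be checked with a short Python sketch. It enumerates the truth table for the logical equality, compares the $S^N$ and $S \times N$ rule counts, and averages the rule outputs for the friendship example. The function names and the membership degrees in the usage lines are illustrative assumptions, not part of the Combs paper.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: a => b."""
    return (not a) or b


def combs_equality_holds() -> bool:
    """Check ((p and q) => r) <=> ((p => r) or (q => r)) on all 8 rows."""
    return all(
        implies(p and q, r) == (implies(p, r) or implies(q, r))
        for p, q, r in product([False, True], repeat=3)
    )


def rule_counts(num_sets: int, num_variables: int) -> tuple[int, int]:
    """Return (traditional, combs) rule counts: S**N versus S*N."""
    return num_sets ** num_variables, num_sets * num_variables


def combs_friendship(fear, trust, love):
    """Average the rule outputs column by column.

    Each argument is a (low, moderate, high) tuple of membership degrees,
    for example fear = (Unafraid, ModerateFear, Afraid).
    """
    outputs = ("Enemies", "Neutral", "GoodFriends")
    return {
        name: (fear[k] + trust[k] + love[k]) / 3
        for k, name in enumerate(outputs)
    }


if __name__ == "__main__":
    print(combs_equality_holds())
    print(rule_counts(5, 5))  # (3125, 25), matching the figures in the text
    # Hypothetical membership degrees, chosen only for illustration.
    print(combs_friendship((0.2, 0.8, 0.0), (0.5, 0.5, 0.0), (0.8, 0.2, 0.0)))
```

The dictionary returned by `combs_friendship` still holds fuzzy degrees; a defuzzification step of the reader's choice would turn it into a single crisp value.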
Fuzzy control system

A fuzzy control system is a control system based on fuzzy logic – a mathematical system that analyzes analog input values in terms of logical variables that take on continuous values between 0 and 1, in contrast to classical or digital logic, which operates on discrete values of either 1 or 0 (true or false, respectively). Fuzzy logic is widely used in machine control. The term "fuzzy" refers to the fact that the logic involved can deal with concepts that cannot be expressed as "true" or "false" but rather as "partially true". Although alternative approaches such as genetic algorithms and neural networks can perform just as well as fuzzy logic in many cases, fuzzy logic has the advantage that the solution to the problem can be cast in terms that human operators can understand, so that their experience can be used in the design of the controller. This makes it easier to mechanize tasks that are already successfully performed by humans. == History and applications == Fuzzy logic was proposed by Lotfi A. Zadeh of the University of California at Berkeley in a 1965 paper. He elaborated on his ideas in a 1973 paper that introduced the concept of "linguistic variables", which in this article equates to a variable defined as a fuzzy set. Other research followed, with the first industrial application, a cement kiln built in Denmark, coming on line in 1976. Fuzzy systems were initially implemented in Japan. Interest in fuzzy systems was sparked by Seiji Yasunobu and Soji Miyamoto of Hitachi, who in 1985 provided simulations that demonstrated the feasibility of fuzzy control systems for the Sendai Subway. Their ideas were adopted, and fuzzy systems were used to control accelerating, braking, and stopping when the Namboku Line opened in 1987. In 1987, Takeshi Yamakawa demonstrated the use of fuzzy control, through a set of simple dedicated fuzzy logic chips, in an "inverted pendulum" experiment.
This is a classic control problem, in which a vehicle tries to keep a pole, mounted on its top by a hinge, upright by moving back and forth. Yamakawa subsequently made the demonstration more sophisticated by mounting a wine glass containing water and even a live mouse to the top of the pendulum: the system maintained stability in both cases. Yamakawa eventually went on to organize his own fuzzy-systems research lab to help exploit his patents in the field. Japanese engineers subsequently developed a wide range of fuzzy systems for both industrial and consumer applications. In 1988 Japan established the Laboratory for International Fuzzy Engineering (LIFE), a cooperative arrangement between 48 companies to pursue fuzzy research. The automotive company Volkswagen was the only foreign corporate member of LIFE, dispatching a researcher for a duration of three years. Japanese consumer goods often incorporate fuzzy systems. Matsushita vacuum cleaners use microcontrollers running fuzzy algorithms to interrogate dust sensors and adjust suction power accordingly. Hitachi washing machines use fuzzy controllers to read load-weight, fabric-mix, and dirt sensors and automatically set the wash cycle for the best use of power, water, and detergent. Canon developed an autofocusing camera that uses a charge-coupled device (CCD) to measure the clarity of the image in six regions of its field of view and uses the information provided to determine if the image is in focus. It also tracks the rate of change of lens movement during focusing, and controls its speed to prevent overshoot. The camera's fuzzy control system uses 12 inputs: 6 to obtain the current clarity data provided by the CCD and 6 to measure the rate of change of lens movement. The output is the position of the lens. The fuzzy control system uses 13 rules and requires 1.1 kilobytes of memory. An industrial air conditioner designed by Mitsubishi uses 25 heating rules and 25 cooling rules.
A temperature sensor provides input, with control outputs fed to an inverter, a compressor valve, and a fan motor. Compared to the previous design, the fuzzy controller heats and cools five times faster, reduces power consumption by 24%, increases temperature stability by a factor of two, and uses fewer sensors. Other applications investigated or implemented include: character and handwriting recognition; optical fuzzy systems; robots, including one for making Japanese flower arrangements; voice-controlled robot helicopters (hovering is a "balancing act" rather similar to the inverted pendulum problem); rehabilitation robotics to provide patient-specific solutions (e.g. to control heart rate and blood pressure ); control of flow of powders in film manufacture; elevator systems; and so on. Work on fuzzy systems is also proceeding in North America and Europe, although on a less extensive scale than in Japan. The US Environmental Protection Agency has investigated fuzzy control for energy-efficient motors, and NASA has studied fuzzy control for automated space docking: simulations show that a fuzzy control system can greatly reduce fuel consumption. Firms such as Boeing, General Motors, Allen-Bradley, Chrysler, Eaton, and Whirlpool have worked on fuzzy logic for use in low-power refrigerators, improved automotive transmissions, and energy-efficient electric motors. In 1995 Maytag introduced an "intelligent" dishwasher based on a fuzzy controller and a "one-stop sensing module" that combines a thermistor, for temperature measurement; a conductivity sensor, to measure detergent level from the ions present in the wash; a turbidity sensor that measures scattered and transmitted light to measure the soiling of the wash; and a magnetostrictive sensor to read spin rate. The system determines the optimum wash cycle for any load to obtain the best results with the least amount of energy, detergent, and water. 
It even adjusts for dried-on foods by tracking the last time the door was opened, and estimates the number of dishes by the number of times the door was opened. Xiera Technologies Inc. has developed the first auto-tuner for the fuzzy logic controller's knowledge base, known as edeX. This technology was tested by Mohawk College and was able to solve non-linear 2x2 and 3x3 multi-input multi-output problems. Research and development is also continuing on fuzzy applications in software, as opposed to firmware, design, including fuzzy expert systems and integration of fuzzy logic with neural-network and so-called adaptive "genetic" software systems, with the ultimate goal of building "self-learning" fuzzy-control systems. These systems can be employed to control complex, nonlinear dynamic plants, for example, the human body. == Fuzzy sets == The input variables in a fuzzy control system are in general mapped by sets of membership functions, known as "fuzzy sets". The process of converting a crisp input value to a fuzzy value is called "fuzzification". The fuzzy logic based approach had been considered by designing two fuzzy systems, one for error heading angle and the other for velocity control. A control system may also have various types of switch, or "ON-OFF", inputs along with its analog inputs, and such switch inputs of course will always have a truth value equal to either 1 or 0, but the scheme can deal with them as simplified fuzzy functions that happen to be either one value or another. Given "mappings" of input variables into membership functions and truth values, the microcontroller then makes decisions for what action to take, based on a set of "rules", each of the form: IF brake temperature IS warm AND speed IS not very fast THEN brake pressure IS slightly decreased. In this example, the two input variables are "brake temperature" and "speed", which have values defined as fuzzy sets.
The output variable, "brake pressure" is also defined by a fuzzy set that can have values like "static" or "slightly increased" or "slightly decreased" etc. === Fuzzy control in detail === Fuzzy controllers are very simple conceptually. They consist of an input stage, a processing stage, and an output stage. The input stage maps sensor or other inputs, such as switches, thumbwheels, and so on, to the appropriate membership functions and truth values. The processing stage invokes each appropriate rule and generates a result for each, then combines the results of the rules. Finally, the output stage converts the combined result back into a specific control output value. The most common shape of membership functions is triangular, although trapezoidal and bell curves are also used, but the shape is generally less important than the number of curves and their placement. From three to seven curves are generally appropriate to cover the required range of an input value, or the "universe of discourse" in fuzzy jargon. As discussed earlier, the processing stage is based on a collection of logic rules in the form of IF-THEN statements, where the IF part is called the "antecedent" and the THEN part is called the "consequent". Typical fuzzy
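The input stage and the brake rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the membership-function breakpoints (in degrees Celsius and km/h) are hypothetical, and AND is taken as the minimum and NOT as the complement, a common choice that the excerpt itself does not specify.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def ramp_up(x: float, start: float, end: float) -> float:
    """Membership that rises linearly from 0 at start to 1 at end."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


def brake_rule_strength(brake_temp_c: float, speed_kmh: float) -> float:
    """Truth value of: brake temperature IS warm AND speed IS not very fast.

    The breakpoints below are hypothetical values chosen for illustration.
    """
    warm = triangular(brake_temp_c, 40.0, 70.0, 100.0)  # fuzzification
    very_fast = ramp_up(speed_kmh, 80.0, 120.0)         # fuzzification
    not_very_fast = 1.0 - very_fast                     # NOT as complement
    return min(warm, not_very_fast)                     # AND as minimum


if __name__ == "__main__":
    # Degree to which "brake pressure IS slightly decreased" is asserted.
    print(brake_rule_strength(55.0, 90.0))
```

The returned strength would then weight the "slightly decreased" output set before the output stage combines all rule results into one control value.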
Painworth

PainWorth is a justice, legal and insurance services application founded by Canadian entrepreneurs Mike Zouhri, Chris Trudel and Ryan Bencic. The application is a "robot lawyer" that uses artificial intelligence to automate personal injury claims for injury victims. It is currently available in Canada and the United States. PainWorth has been featured by several news outlets, including CTV, Global News, CBC, and has also been featured by the American Bar Association and LexisNexis for its role addressing social issues such as access to justice and other systemic issues in the legal and insurance industry. == Application == PainWorth began as a tool for calculating non-pecuniary damages for injury victims but has since expanded beyond a personal injury calculator to include features that help injury victims and business users with pecuniary damages, economic calculations, prescribed rates and providing informational guides to help navigate settlement negotiation, managing claims records and other issues encountered by self-represented litigants or claims managers. The platform makes use of automation to provide free user-guided calculations, steps and processes to successfully settle an injury claim. The application is supported by Microsoft Azure. == Personal Injury Calculator == PainWorth is the first service to use Artificial Intelligence to interpret case law in order to determine the value of pain and suffering incurred by specific injury types and injury severities. The cited case law is used as evidence and presented in statistical models to determine an accurate valuation compliant with the jurisdiction, regulatory rules and case complexities. == General Damages Calculator == PainWorth also offers a personal injury settlement calculator that assesses general damages based on specific case complexities and jurisdiction. The service takes into account medical complications and recovery in order to calculate the fair valuation. 
== Injury Settlement Platform == PainWorth's insurance settlement platform provides a direct, automated resolution center for settling cases for their assessed value without enduring the hardship of litigation. In 2021, PainWorth won the title of World's Best Emerging Insurance Product for the development of this platform. == History == In 2019, Mike Zouhri was struck by a drunk driver, which left him seriously injured and resulted in a lawsuit. Frustrated by the slow and expensive process, Zouhri went to the law library and learned how to manage injury claims. After learning the process, he partnered with lawyers and legal advisors to create an app that allows users to quickly settle their own injury claims fairly and accurately. Soon after its launch, PainWorth was adopted by thousands of users and gained significant media coverage. Global News reported that the bot had helped people with more than $10 million in claims in only a few months, all free of charge. In July 2020, PainWorth began raising concerns over injustices and gender bias in Canadian courts.
Elasticity (computing)

In computing, elasticity is defined as "the degree to which a system is able to adapt to workload changes by provisioning and de-provisioning resources in an autonomic manner, such that at each point in time the available resources match the current demand as closely as possible". Elasticity is a defining characteristic that differentiates cloud computing from previously proposed distributed computing paradigms, such as grid computing. The dynamic adaptation of capacity, e.g., by altering the use of computing resources, to meet a varying workload is called "elastic computing". In the distributed-systems literature, definitions vary by author: some consider scalability a sub-part of elasticity, while others treat the two as distinct. == Purpose == Elasticity aims to match the amount of resources allocated to a service with the amount of resources it actually requires, avoiding over- or under-provisioning. Over-provisioning, i.e., allocating more resources than required, should be avoided as it may incur extra costs (monetary, energy, operational, etc.) for unused or underutilized resources. For example, if a website is over-provisioned with two cloud computing resources to handle current demand that only requires one resource, the costs of maintaining the second resource would effectively be wasted. Under-provisioning, i.e., allocating fewer resources than required, must be avoided; otherwise, the service cannot serve its users well. For example, under-provisioning a website may make it seem slow or unreachable, because not enough resources have been allocated to meet current demand. == Example == Elasticity can be illustrated through an example of a service provider who wants to run a website on the cloud. At moment $t_0$, the website is unpopular and a single machine is sufficient to serve all users.
At moment $t_1$, the website suddenly becomes popular, and a single machine is no longer sufficient to serve all users. Based on the number of web users simultaneously accessing the website and the resource requirements of the web server, ten machines are needed. An elastic system should immediately detect this condition and provision nine additional machines from the cloud to serve all users responsively. At moment $t_2$, the website becomes unpopular again. The ten machines currently allocated to the website are mostly idle and a single machine would be sufficient to serve the few users who are accessing the website. An elastic system should immediately detect this condition and deprovision nine machines, releasing them to the cloud. == Problems == === Resource provisioning time === Resource provisioning takes time. A cloud virtual machine (VM) can be acquired at any time by the user; however, it may take up to several minutes for the acquired VM to be ready to use. The VM startup time is dependent on factors such as image size, VM type, data center location, number of VMs, etc. Cloud providers have different VM startup performance. This implies that any control mechanism designed for elastic applications must consider the time needed for the resource provisioning actions to take effect. === Monitoring elastic applications === Elastic applications can allocate and deallocate resources on demand for specific application components. This makes cloud resources volatile, and traditional monitoring tools which associate monitoring data with a particular resource, such as Ganglia or Nagios, are no longer suitable for monitoring the behavior of elastic applications. For example, during its lifetime, a data storage tier of an elastic application might add and remove data storage VMs due to cost and performance requirements, varying the number of used VMs.
Thus, additional information is needed in monitoring elastic applications, such as associating the logical application structure with the underlying virtual infrastructure. This in turn generates other problems, such as aggregating data from multiple VMs to extract the behavior of the application component running on top of those VMs, as different metrics may need to be aggregated differently (e.g., CPU usage could be averaged, network transfer might be summed up). === Stakeholder requirements === When deploying applications in cloud infrastructures (IaaS/PaaS), stakeholder requirements need to be considered in order to ensure that elastic behavior meets stakeholder needs. Traditionally, the optimal trade-off between cost and quality or performance is considered; however, for real world cloud users, requirements regarding elastic behavior are more complex and target multiple dimensions of elasticity (e.g., SYBL). === Multiple levels of control === Cloud applications vary in type and complexity, with multiple levels of artifacts deployed in layers. Controlling such structures must take into consideration a variety of issues. For multi-level control, control systems need to consider the impact lower level control has upon higher level ones, and vice versa (e.g., controlling virtual machines, web containers, or web services at the same time), as well as conflicts that may appear between various control strategies from various levels. Elastic strategies in cloud computing can take advantage of control-theoretic methods (e.g., predictive control has been tested in cloud computing scenarios and has shown considerable advantages over reactive methods). One approach to multi-level elastic cloud control is rSYBL.
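The website example and the metric-aggregation remark above can be illustrated with a small Python sketch. It is a simplified reactive model under stated assumptions: every machine is assumed to serve a fixed number of concurrent users, provisioning delay is ignored, and the function names and metric keys (`cpu`, `net_bytes`) are hypothetical.

```python
import math


def machines_needed(concurrent_users: int, users_per_machine: int,
                    minimum: int = 1) -> int:
    """Smallest machine count that covers the demand (never below minimum)."""
    return max(minimum, math.ceil(concurrent_users / users_per_machine))


def scaling_delta(current_machines: int, concurrent_users: int,
                  users_per_machine: int) -> int:
    """Positive: machines to provision. Negative: machines to release."""
    return machines_needed(concurrent_users, users_per_machine) - current_machines


def aggregate_tier_metrics(samples: list[dict]) -> dict:
    """Aggregate per-VM samples for one application tier.

    CPU usage is averaged across VMs, network transfer is summed.
    """
    if not samples:
        raise ValueError("at least one VM sample is required")
    return {
        "cpu": sum(s["cpu"] for s in samples) / len(samples),
        "net_bytes": sum(s["net_bytes"] for s in samples),
    }


if __name__ == "__main__":
    capacity = 100  # hypothetical: users one machine can serve
    print(scaling_delta(1, 80, capacity))     # t0: demand fits on one machine
    print(scaling_delta(1, 1000, capacity))   # t1: nine machines to add
    print(scaling_delta(10, 80, capacity))    # t2: nine machines to release
```

A production controller would also have to account for VM startup time, as noted under resource provisioning time, for instance by scaling on predicted rather than current demand.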
Type-1 OWA operators

Type-1 OWA operators are a set of aggregation operators that generalise Yager's OWA (ordered weighted averaging) operators in order to aggregate fuzzy sets rather than crisp values in soft decision making and data mining. These operators provide a mathematical technique for directly aggregating uncertain information with uncertain weights via the OWA mechanism, where these uncertain objects are modelled by fuzzy sets. The two definitions for type-1 OWA operators are based on Zadeh's Extension Principle and on $\alpha$-cuts of fuzzy sets. The two definitions lead to equivalent results. == Definitions == === Definition 1 === Let $F(X)$ be the set of fuzzy sets with domain of discourse $X$. Given $n$ linguistic weights $\{W^i\}_{i=1}^{n}$ in the form of fuzzy sets defined on the domain of discourse $U = [0, 1]$, a type-1 OWA operator is a mapping $\Phi$,
$$\Phi \colon F(X) \times \cdots \times F(X) \longrightarrow F(X), \qquad (A^1, \cdots, A^n) \mapsto Y,$$
such that
$$\mu_Y(y) = \sup_{\sum_{i=1}^{n} \bar{w}_i a_{\sigma(i)} = y} \left( \mu_{W^1}(w_1) \wedge \cdots \wedge \mu_{W^n}(w_n) \wedge \mu_{A^1}(a_1) \wedge \cdots \wedge \mu_{A^n}(a_n) \right),$$
where $\bar{w}_i = \frac{w_i}{\sum_{j=1}^{n} w_j}$, and $\sigma \colon \{1, \cdots, n\} \longrightarrow \{1, \cdots, n\}$ is a permutation function such that $a_{\sigma(i)} \geq a_{\sigma(i+1)},\ \forall i = 1, \cdots, n-1$, i.e., $a_{\sigma(i)}$ is the $i$th highest element in the set $\{a_1, \cdots, a_n\}$. === Definition 2 === Using the alpha-cuts of fuzzy sets: Given the $n$ linguistic weights $\{W^i\}_{i=1}^{n}$ in the form of fuzzy sets defined on the domain of discourse $U = [0, 1]$, then for each $\alpha \in [0, 1]$, an $\alpha$-level type-1 OWA operator with $\alpha$-level sets $\{W_\alpha^i\}_{i=1}^{n}$ to aggregate the $\alpha$-cuts of fuzzy sets $\{A^i\}_{i=1}^{n}$ is:
$$\Phi_\alpha\left(A_\alpha^1, \ldots, A_\alpha^n\right) = \left\{ \frac{\sum_{i=1}^{n} w_i a_{\sigma(i)}}{\sum_{i=1}^{n} w_i} \;\middle|\; w_i \in W_\alpha^i,\ a_i \in A_\alpha^i,\ i = 1, \ldots, n \right\},$$
where $W_\alpha^i = \{w \mid \mu_{W^i}(w) \geq \alpha\}$, $A_\alpha^i = \{x \mid \mu_{A^i}(x) \geq \alpha\}$, and $\sigma \colon \{1, \cdots, n\} \to \{1, \cdots, n\}$ is a permutation function such that $a_{\sigma(i)} \geq a_{\sigma(i+1)},\ \forall i = 1, \cdots, n-1$, i.e., $a_{\sigma(i)}$ is the $i$th largest element in the set $\{a_1, \cdots, a_n\}$.
== Representation theorem of Type-1 OWA operators == Given the $n$ linguistic weights $\{W^i\}_{i=1}^{n}$ in the form of fuzzy sets defined on the domain of discourse $U = [0, 1]$, and the fuzzy sets $A^1, \cdots, A^n$, we have $Y = G$, where $Y$ is the aggregation result obtained by Definition 1 and $G$ is the result obtained by Definition 2. == Programming problems for Type-1 OWA operators == According to the Representation Theorem of Type-1 OWA Operators, a general type-1 OWA operator can be decomposed into a series of $\alpha$-level type-1 OWA operators. In practice, this series of $\alpha$-level type-1 OWA operators is used to construct the resulting aggregation fuzzy set. So we only need to compute the left end-points and right end-points of the intervals $\Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)$.
Then, the resulting aggregation fuzzy set is constructed with the membership function as follows:
$$\mu_G(x) = \bigvee_{\alpha \,:\, x \in \Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)} \alpha$$
For the left end-points, we need to solve the following programming problem:
$$\Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)_{-} = \min_{\substack{W_{\alpha-}^i \leq w_i \leq W_{\alpha+}^i \\ A_{\alpha-}^i \leq a_i \leq A_{\alpha+}^i}} \frac{\sum_{i=1}^{n} w_i a_{\sigma(i)}}{\sum_{i=1}^{n} w_i}$$
while for the right end-points, we need to solve the following programming problem:
$$\Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)_{+} = \max_{\substack{W_{\alpha-}^i \leq w_i \leq W_{\alpha+}^i \\ A_{\alpha-}^i \leq a_i \leq A_{\alpha+}^i}} \frac{\sum_{i=1}^{n} w_i a_{\sigma(i)}}{\sum_{i=1}^{n} w_i}$$
A fast method has been presented to solve these two programming problems so that the type-1 OWA aggregation operation can be performed efficiently; for details, please see the paper. == Alpha-level approach to Type-1 OWA operation == Three-step process:
Step 1: To set up the $\alpha$-level resolution in [0, 1].
Step 2—For each α ∈ [ 0 , 1 ] {\displaystyle \alpha \in [0,1]} , Step 2.1—To calculate ρ α + i 0 ∗ {\displaystyle \rho _{\alpha +}^{i_{0}^{\ast }}} Let i 0 = 1 {\displaystyle i_{0}=1} ; If ρ α + i 0 ≥ A α + σ ( i 0 ) {\displaystyle \rho _{\alpha +}^{i_{0}}\geq A_{\alpha +}^{\sigma (i_{0})}} , stop, ρ α + i 0 {\displaystyle \rho _{\alpha +}^{i_{0}}} is the solution; otherwise go to Step 2.1-3. i 0 ← i 0 + 1 {\displaystyle i_{0}\leftarrow i_{0}+1} , go to Step 2.1-2. Step 2.2 To calculate ρ α − i 0 ∗ {\displaystyle \rho _{\alpha -}^{i_{0}^{\ast }}} Let i 0 = 1 {\displaystyle i_{0}=1} ; If ρ α − i 0 ≥ A α − σ ( i 0 ) {\displaystyle \rho _{\alpha -}^{i_{0}}\geq A_{\alpha -}^{\sigma (i_{0})}} , stop, ρ α − i 0 {\displaystyle \rho _{\alpha -}^{i_{0}}} is the solution; otherwise go to Step 2.2-3. i 0 ← i 0 + 1 {\displaystyle i_{0}\leftarrow i_{0}+1} , go to step Step 2.2-2. Step 3—To construct the aggregation resulting fuzzy set G {\displaystyle G} based on all the available intervals [ ρ α − i 0 ∗ , ρ α + i 0 ∗ ] {\displaystyle \left[{\rho _{\alpha -}^{i_{0}^{\ast }},\;\rho _{\alpha +}^{i_{0}^{\ast }}}\right]} : μ G ( x ) = ⋁ α : x ∈ [ ρ α − i 0 ∗ , ρ α + i 0 ∗ ] ⁡ α {\displaystyle \mu _{G}(x)=\operatorname {\bigvee } \limits _{\alpha :x\in \left[{\rho _{\alpha -}^{i_{0}^{\ast }},\;\rho _{\alpha +}^{i_{0}^{\ast }}}\right]}\alpha } == Some Examples == The type-1 OWA operator with the weights shown in the top figure is used to aggregate the fuzzy sets (solide lines) in the bottom figure, and the dashed line is the aggregation result. == Special cases == Any OWA operators, like maximum, minimum, mean operators; Join operators of (type-1) fuzzy sets, i.e., fuzzy maximum operators; Meet operators of (type-1) fuzzy sets, i.e., fuzzy minimum operators; Join-like operators of (type-1) fuzzy sets; Meet-like operators of (type-1) fuzzy sets. == Generalizations == Type-2 OWA operators have been suggested to aggregate the type-2 fuzzy sets for soft decision making. 
== Applications ==

Type-1 OWA operators have been applied to different domains for soft decision making.
Improved efficiency of computing approach;
Type reduction of type-2 fuzzy sets;
Group decision making;
Credit risk evaluation;
Information fusion;
Linguistic expressions and symbolic translation;
Sentiment analysis;
Ro
Pommerman Challenge

The Pommerman Challenge is a multi-agent game to test autonomous artificial intelligence systems.

== Game structure ==

Two-agent teams compete against each other on an 11 × 11 board. Each agent can observe only part of the board, and the agents cannot communicate. The goal is to knock down the opponents. Agents place explosives to destroy walls and collect power-ups that appear from those walls, while avoiding death. Game objects can move unpredictably or be moved by an agent.

== Play ==

The game involves real-time decision making. Agents must choose moves in about 0.1 seconds.

== Algorithms ==

The real-time requirement limits the use of compute-heavy techniques such as Monte Carlo tree search. The branching factor at each move can be as large as 1,296, because all four agents act in each step, each choosing among six possibilities. The agents choose by accounting for explosions, which have lifetimes of 10 steps. Explosions derail tree search techniques: searches with fewer than 10 levels ignore explosions, while deeper searches consider too many choices (given the branching factor).

A hybrid approach uses a limited-depth tree search followed by exploring a deterministic/pessimistic scenario. Limiting the depth keeps the search tree small. The deterministic approach can predict far into the future by omitting branching. "Good" actions are often those that perform well under pessimistic scenarios, particularly if safety is important. Identifying the worst sequence of positions for an object can suggest where to move it. After generating pessimistic scenarios, the agent quantifies the survivability of each move, notionally the number of positions in which the agent can then remain safely (without encountering other agents).

== Competitions ==

Three competitions were organized, with slightly changing rules, during 2018–2019.

=== Online - FFA ===

This round was a warm-up online event, where each competitor controlled only one agent.
Results:
1st: Agent47Agent by Yichen Gong
2nd: aiKiller by Márton Görög

=== NeurIPS 2018 - Team ===

The first Pommerman competition with in-person finals.

Results:
1st: hakozakijunctions by Toshihiro Takahashi
2nd: eisenach by Márton Görög
3rd: dypm by Takayuki Osogami

The three best-performing solutions used online tree search.

=== NeurIPS 2019 - Team Radio ===

The second competition with in-person finals improved communication between teammate agents.

Results:
1st: Márton Görög
2nd: Paul Jasek
3rd: Yifan Zhang
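As a side note on the Algorithms section above, the branching-factor arithmetic can be checked in a few lines. This is only an illustrative calculation: the constants come from the text (six choices per agent, four agents, explosions lasting 10 steps), while the function names are hypothetical.

```python
ACTIONS_PER_AGENT = 6      # choices available to each agent per step
AGENTS = 4                 # all four agents act in each step
EXPLOSION_LIFETIME = 10    # steps before an explosive goes off


def joint_branching_factor(actions, agents):
    """Number of joint moves when every agent picks independently."""
    return actions ** agents


def leaf_count(branching, depth):
    """Leaves of a full search tree with the given branching factor and depth."""
    return branching ** depth


if __name__ == "__main__":
    b = joint_branching_factor(ACTIONS_PER_AGENT, AGENTS)
    print(b)  # 1296
    # A full tree deep enough to see an explosion resolve is far too large
    # to enumerate within a 0.1 second move budget.
    print(leaf_count(b, EXPLOSION_LIFETIME))
```

The second number has more than 30 digits, which is why a full-width search to the depth of one explosion lifetime is impractical.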
Willy's Chocolate Experience

Willy's Chocolate Experience was an unlicensed event based on Charlie and the Chocolate Factory that took place in Glasgow, Scotland, in February 2024. The event was promoted as an immersive and interactive family experience, illustrated on a promotional website with "dreamlike" AI-generated images. Once it was discovered that the event was held in a sparsely decorated warehouse, many customers complained, and the police were called to the venue. The event went viral on the Internet and attracted worldwide media attention. The event drew comparisons to the 2008 Lapland New Forest controversy, the 2014 Tumblr fan convention DashCon, and Billy McFarland's 2017 Fyre Festival. == Background and advertising == The event was stated to take place over the weekend of 24–25 February 2024. Promotional material advertised "stunning and intricately designed settings inspired by Roald Dahl's timeless tale" and "an array of delectable treats scattered throughout the experience". Both the website and promotional material used poor-quality AI-generated images, which included several spelling errors such as "cartchy tuns" and "a pasadise of sweet teats" and nonsensical words such as "catgacating" and "exarserdray". Tickets cost up to £35 per person. While the event was being promoted in early February, a Reddit user who saw Facebook advertisements suspected it to be a scam and was surprised that people were apparently buying tickets based solely on AI-generated images. The event was organised by House of Illuminati, a company registered to Billy Coull which claimed to offer "unparalleled immersive experiences". An investigation by Third Force News conducted after the event described Coull's previous "murky involvement in the charity sector." Coull had previously registered several other companies and claimed to work as a "consultant" for the now-defunct brand Empowerity, formerly known as the charity Gowanbank Community Hub. 
In 2021, Gowanbank was forced to remove claims of a £95-per-ticket fundraising "gala" at DoubleTree Glasgow which had been falsely advertised to feature TV personalities and performers including Gok Wan and Joe Black. Coull had claimed to be a doctor with a fake degree from a false university that provided "metaphysical degrees", and had attempted to use the charity to win the 2022 Glasgow City Council election in the seat of Greater Pollok, though he never registered for the election. In the summer of 2023, he independently published 17 AI-generated books on various topics, including vaccine conspiracy theories. Rolling Stone concluded that House of Illuminati's websites and event descriptions were likely written by an AI chatbot, such as ChatGPT. Three actors were hired to portray "Willy McDuff", a character based on Willy Wonka. One of them, Paul Connell, said that the cast were given one day to learn the script. Another actor playing Willy McDuff was 18-year-old Michael Archibald; the experience was his first ever acting job, and he was given the script at 6 pm on Friday before the event began on Saturday. Kirsty Paterson, an actress who played one of the Oompa-Loompas (called "Wonkidoodles" in the script), said that the job offer had been posted on Indeed.com and offered £500 for two days of work. The day before the event, the actors attended a dress rehearsal at the sparsely decorated venue. They were told that others would be working through the night on the production. When they returned on the day of the event, the venue was in the same condition. Paterson was given her costume an hour before the event opened, saying that "We were just handed an Amazon box that probably arrived that morning." == Script == The script for the event is titled Wonkidoodles at McDuff's Chocolate Factory: A Script, and describes Willy McDuff leading an audience through the Garden of Enchantment and the Twilight Tunnel. 
Once there, they are confronted by a character called The Unknown, described as "an evil chocolate maker who lives in the walls" who seeks to steal the magical "Anti-Graffiti Gobstopper" from McDuff's Imagination Lab. The gobstopper is "a sweet so powerful, it can make any room sparkle without lifting a finger". McDuff defeats The Unknown by amplifying the power of the gobstopper and causing his enemy to be "gently swept up by a robotic vacuum, humorously ending the confrontation". The script was unusual in that it included stage directions for the audience, and descriptions of their reactions. Connell described it as "15 pages of AI-generated gibberish of me just monologuing these mad things", and compared the vacuum cleaner plot point to that of the Nintendo video game Luigi's Mansion. Interviewed after the event, Coull claimed to have written the script himself, using AI only to "check spelling, grammar, and continuity" as he said he had dyslexia. == Event == The event was held at the Box Hub Warehouse event space in Whiteinch, an industrial area of Glasgow. Customers described the venue as "little more than an abandoned, empty warehouse", with set dressings including a small bouncy castle, AI-generated backdrop images pinned to some of the walls, and props which were "strewn about on bare concrete floors". The venue's windows were dirty and its air conditioning systems were left exposed. Paterson has stated that by the time she saw the venue, she had already signed her contract and "didn't want to disappoint the kids", and thus chose to proceed with the work. The Unknown was played by a 16-year-old actress named Felicia Dawkins, who wore a silver mask and a black cloak. Young children were frightened by the character, who appeared from behind a large rectangular mirror. Despite the script calling for The Unknown to be defeated with a vacuum cleaner, no such prop was provided, and actors were instead asked to improvise. 
Connell said that he and other employees were told to give each child "two jelly beans and a quarter of a cup of lemonade", although the limited supply of jelly beans quickly ran out. Paterson and another "Wonkidoodle" actress, Jenny Fogarty, said that after the first three 45-minute performances, the cast were told to abandon the script and instead let guests walk through the venue, a process that Paterson said took "about two minutes". The character of The Unknown, previously introduced as the main antagonist, was now "scaring children for no reason". One of the actors playing McDuff improvised the idea that children should pull a "silly face" at The Unknown to scare them away, but Dawkins said that, in other cases, she "just had to awkwardly walk back to my corner". Connell was told he would be given a 15-minute break every 45 minutes, but on the day of the event, he played Willy McDuff for three and a half hours without a break. After returning from a lunch break, Connell encountered a crowd of customers demanding refunds from Coull, and the other actors were unsure what to do next. After being told that the event was now cancelled halfway through its opening day, the actors left and went to a pub. Upon returning to the venue some time later, Connell said that he felt "the threat of violence had become quite high" and that there were two police vans and two squad cars at the scene. == Customer reviews and response == Willy's Chocolate Experience was widely criticised by those who attended it, many of whom demanded refunds. One customer, who had driven with his children for two hours to reach the event, described it as an "absolute con". Other visitors who arrived after the event was closed and were not informed of its cancellation requested compensation for wasted rail fares. Following the event's cancellation, Coull offered to refund 850 people, a statement repeated by the event's Facebook page. 
Some Facebook users stated that they had received their money back. Paterson and Fogarty stated that they only received half of their paycheque. Box Hub, the organisation that had rented the warehouse to House of Illuminati, issued an apology on House of Illuminati's behalf, stating that they "either have no regards for the families and young children they have disappointed or are too embarrassed to comment", and offered to provide a venue free of charge for those who attended the event. House of Illuminati later stated that they would not host any future events. Coull deleted his LinkedIn profile, his YouTube channel, and his personal website in response to the controversy. A few days after the event, Connell said he felt that Coull was "probably one of the most disliked people in Glasgow right now". In an interview with The Sunday Times, Coull apologised for how the event turned out, saying he would accept responsibility. == Fundraising == In an interview with Wired magazine, Connell stated that he and the other actors were working with parents to provide a free show for the children who attended. Some items from the event were later auctioned for charity. The venue auctioned the leftover hand-written "even
Microscope image processing

Microscope image processing is a broad term that covers the use of digital image processing techniques to process, analyze and present images obtained from a microscope. Such processing is now commonplace in a number of diverse fields such as medicine, biological research, cancer research, drug testing, metallurgy, etc. A number of manufacturers of microscopes now specifically design in features that allow the microscopes to interface to an image processing system. == Image acquisition == Until the early 1990s, most image acquisition in video microscopy applications was typically done with an analog video camera, often simply closed circuit TV cameras. While this required the use of a frame grabber to digitize the images, video cameras provided images at full video frame rate (25-30 frames per second) allowing live video recording and processing. While the advent of solid state detectors yielded several advantages, the real-time video camera was actually superior in many respects. Today, acquisition is usually done using a CCD camera mounted in the optical path of the microscope. The camera may be full colour or monochrome. Very often, very high resolution cameras are employed to gain as much direct information as possible. Cryogenic cooling is also common, to minimise noise. Often digital cameras used for this application provide pixel intensity data to a resolution of 12-16 bits, much higher than is used in consumer imaging products. Ironically, in recent years, much effort has been put into acquiring data at video rates, or higher (25-30 frames per second or higher). What was once easy with off-the-shelf video cameras now requires special, high speed electronics to handle the vast digital data bandwidth. Higher speed acquisition allows dynamic processes to be observed in real time, or stored for later playback and analysis. 
Combined with the high image resolution, this approach can generate vast quantities of raw data, which can be a challenge to deal with, even with a modern computer system. While current CCD detectors allow very high image resolution, often this involves a trade-off because, for a given chip size, as the pixel count increases, the pixel size decreases. As the pixels get smaller, their well depth decreases, reducing the number of electrons that can be stored. In turn, this results in a poorer signal-to-noise ratio. For best results, one must select an appropriate sensor for a given application. Because microscope images have an intrinsic limiting resolution, it often makes little sense to use a noisy, high-resolution detector for image acquisition. A more modest detector, with larger pixels, can often produce much higher quality images because of reduced noise. This is especially important in low-light applications such as fluorescence microscopy. Moreover, one must also consider the temporal resolution requirements of the application. A lower resolution detector will often have a significantly higher acquisition rate, permitting the observation of faster events. Conversely, if the observed object is motionless, one may wish to acquire images at the highest possible spatial resolution without regard to the time required to acquire a single image.

== 2D image techniques ==

Image processing for microscopy applications begins with fundamental techniques intended to most accurately reproduce the information contained in the microscopic sample. This might include adjusting the brightness and contrast of the image, averaging images to reduce image noise and correcting for illumination non-uniformities. Such processing involves only basic arithmetic operations between images (i.e. addition, subtraction, multiplication and division). The vast majority of processing done on microscope images is of this nature.
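The arithmetic operations mentioned above can be sketched on tiny images stored as nested lists. The example shows frame averaging (addition and division) and one common way of correcting uneven illumination, dividing by a blank-field reference image, often called flat-field correction. The choice of flat-field correction and the function names are assumptions made for illustration; the sketch also assumes the reference image contains no zero pixels.

```python
def average_frames(frames):
    """Pixel-wise mean of equally sized frames (reduces uncorrelated noise)."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def flat_field_correct(image, flat):
    """Divide out uneven illumination using a blank-field reference image.

    Each pixel is scaled by mean(flat) / flat[pixel], so regions that were
    lit more dimly than average are brightened and vice versa.
    """
    flat_pixels = [v for row in flat for v in row]
    mean_flat = sum(flat_pixels) / len(flat_pixels)
    return [[img_v * mean_flat / flat_v
             for img_v, flat_v in zip(img_row, flat_row)]
            for img_row, flat_row in zip(image, flat)]


if __name__ == "__main__":
    noisy = [[[10, 20]], [[14, 24]]]          # two 1x2 frames of the same scene
    print(average_frames(noisy))
    print(flat_field_correct([[5, 30]], [[0.5, 1.5]]))
```

In practice the same operations are applied to full-size arrays, usually with an array library rather than nested lists.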
Another class of common 2D operations, image convolution, is often used to reduce or enhance image details. Such "blurring" and "sharpening" algorithms in most programs work either by altering a pixel's value based on a weighted sum of that pixel and the surrounding pixels (kernel-based convolution) or by altering the frequency-domain representation of the image using the Fourier transform. Many image processing techniques are performed in the frequency domain. Other basic two-dimensional techniques include operations such as image rotation, warping and color balancing.

At times, advanced techniques are employed with the goal of "undoing" the distortion of the optical path of the microscope, thus eliminating distortions and blurring caused by the instrumentation. This process is called deconvolution, and a variety of algorithms have been developed, some of great mathematical complexity. The end result is an image far sharper and clearer than could be obtained in the optical domain alone. This is typically a 3-dimensional operation, which analyzes a volumetric image (i.e. images taken at a variety of focal planes through the sample) and uses this data to reconstruct a more accurate 3-dimensional image.

== 3D image techniques ==

Another common requirement is to take a series of images at a fixed position, but at different focal depths. Since most microscopic samples are essentially transparent, and the depth of field of the focused sample is exceptionally narrow, it is possible to capture images "through" a three-dimensional object using 2D equipment like confocal microscopes. Software is then able to reconstruct a 3D model of the original sample which may be manipulated appropriately. The processing turns a 2D instrument into a 3D instrument, which would not otherwise exist. In recent times this technique has led to a number of scientific discoveries in cell biology.
== Analysis == Analysis of images will vary considerably according to application. Typical analysis includes determining where the edges of an object are, counting similar objects, calculating the area, perimeter length and other useful measurements of each object. A common approach is to create an image mask which only includes pixels that match certain criteria, then perform simpler scanning operations on the resulting mask. It is also possible to label objects and track their motion over a series of frames in a video sequence.
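The mask-then-scan approach described above can be sketched as follows. The example builds a binary mask with a simple intensity threshold, then scans it to count objects and measure their areas by grouping 4-connected pixels. The threshold criterion, the 4-connectivity choice and the function names are assumptions made for illustration.

```python
from collections import deque


def threshold_mask(image, level):
    """Mask that keeps only pixels at or above the given intensity."""
    return [[value >= level for value in row] for row in image]


def object_areas(mask):
    """Areas (in pixels) of the 4-connected objects found in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects every pixel of this object.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


if __name__ == "__main__":
    image = [
        [0, 9, 9, 0, 0],
        [0, 9, 0, 0, 8],
        [0, 0, 0, 0, 8],
    ]
    areas = object_areas(threshold_mask(image, 5))
    print(len(areas), areas)  # 2 [3, 2]
```

The length of the returned list is the object count; perimeter or centroid measurements could be accumulated in the same flood-fill loop.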
Land of Memories

Land of Memories (Chinese: 机忆之地) is a Chinese science-fiction novel by Shen Yang (沈阳), a professor at Tsinghua University's School of Journalism and Communication. The story revolves around a former neuroscientist trying to recover her memories from the metaverse after suffering amnesia due to an accident. It contains almost 6,000 Chinese characters and was shortened from an AI-generated draft that was 43,000 characters long. The process involved 66 prompts spanning almost three hours. The novel was among 18 submissions that won the level-two prize at the Fifth Jiangsu Youth Science Education and Science Fiction Competition (第五届江苏省青年科普科幻作品大赛). The contest was restricted to participants between the ages of 14 and 45 but did not forbid entries generated by AI. One of its organizers reached out to Shen after finding out that the professor had been experimenting with writing science fiction using AI. The judges were not told about the novel's origin in advance. Three of the six judges approved the work. One judge, who had worked with AI models before, recognized that the novel was written by AI and criticized the work for lacking emotional appeal. The organizer who had contacted Shen said that the novel's introduction was not bad but that the story did not develop well and would not meet the usual standards for publication. However, he still plans to allow AI-generated submissions in 2024. Fu Ruchu, editorial department director of the People's Literature Publishing House, said the novel was not easily identifiable as AI-generated and applauded its logical consistency. She warned that artificial intelligence could endanger the jobs of fiction writers and cause permanent damage to literary language.
For a Breath I Tarry

"For a Breath I Tarry" is a 1966 post-apocalyptic novelette by American writer Roger Zelazny, which was nominated for the Hugo Award for Best Novelette in 1967. Set in a future long after the self-extinction of humanity, the novelette recounts the tale of Frost, a sentient machine. Although humans have caused their own extinction, the sentient machines that they created continue the work of rebuilding a shattered Earth. Along the way, the story explores the differences between humanity and machines, the former experiencing the world qualitatively, while the latter doing so quantitatively. This difference is illustrated through philosophical conversations between Frost and another machine named Mordel. Frost's goal of becoming human, along with literary allusions, drives the plot and sets the tone of the novelette. These allusions include the first chapter of the Book of Job, in both situation and language, since verses are both quoted directly and paraphrased. In addition, the first three chapters of the Book of Genesis are echoed. Finally, Frost and Mordel enter into a Faustian bargain, though with better results than in the original story. The other major character is the Beta Machine, Frost's peer in the Southern Hemisphere. (Frost controls the Northern Hemisphere.) The novelette hints that though being a machine, Beta has a feminine personality. After Frost has succeeded in his millennium-long quest to become human (via recovered DNA), Beta agrees to join him in becoming human—suggesting the possibility of rebirth for the human race. The novelette has appeared in collections of Zelazny's works and in anthologies. The title is from a phrase in the poet A. E. Housman's collection A Shropshire Lad.
Batch normalization

In artificial neural networks, batch normalization (also known as batch norm) is a normalization technique used to make training faster and more stable by adjusting the inputs to each layer—re-centering them around zero and re-scaling them to a standard size. It was introduced by Sergey Ioffe and Christian Szegedy in 2015. Experts still debate why batch normalization works so well. It was initially thought to tackle internal covariate shift, a problem where parameter initialization and changes in the distribution of the inputs of each layer affect the learning rate of the network. However, newer research suggests it doesn’t fix this shift but instead smooths the objective function—a mathematical guide the network follows to improve—enhancing performance. In very deep networks, batch normalization can initially cause a severe gradient explosion—where updates to the network grow uncontrollably large—but this is managed with shortcuts called skip connections in residual networks. Another theory is that batch normalization adjusts data by handling its size and path separately, speeding up training. == Internal covariate shift == Each layer in a neural network has inputs that follow a specific distribution, which shifts during training due to two main factors: the random starting values of the network’s settings (parameter initialization) and the natural variation in the input data. This shifting pattern affecting the inputs to the network’s inner layers is called internal covariate shift. While a strict definition isn’t fully agreed upon, experiments show that it involves changes in the means and variances of these inputs during training. Batch normalization was first developed to address internal covariate shift. During training, as the parameters of preceding layers adjust, the distribution of inputs to the current layer changes accordingly, such that the current layer needs to constantly readjust to new distributions. 
This issue is particularly severe in deep networks, because small changes in shallower hidden layers are amplified as they propagate through the network, resulting in a significant shift in deeper hidden layers. Batch normalization was proposed to reduce these unwanted shifts, in order to speed up training and produce more reliable models. Beyond possibly tackling internal covariate shift, batch normalization offers several additional advantages. It allows the network to use a higher learning rate (a setting that controls how quickly the network learns) without causing problems like vanishing or exploding gradients, where updates become too small or too large. It also appears to have a regularizing effect, improving the network's ability to generalize to new data and reducing the need for dropout, a technique used to prevent overfitting (when a model learns the training data too well and fails on new data). Additionally, networks using batch normalization are less sensitive to the choice of starting settings or learning rates, making them more robust and adaptable.

== Procedures ==

=== Transformation ===

In a neural network, batch normalization is achieved through a normalization step that fixes the means and variances of each layer's inputs. Ideally, the normalization would be conducted over the entire training set, but to use this step jointly with stochastic optimization methods, it is impractical to use the global information. Thus, normalization is restricted to each mini-batch in the training process.

Let $B$ denote a mini-batch of size $m$ of the entire training set. The empirical mean and variance of $B$ can then be written as

$$\mu_{B}=\frac{1}{m}\sum_{i=1}^{m}x_{i} \quad\text{and}\quad \sigma_{B}^{2}=\frac{1}{m}\sum_{i=1}^{m}(x_{i}-\mu_{B})^{2}.$$

For a layer of the network with $d$-dimensional input, $x=(x^{(1)},...,x^{(d)})$, each dimension of its input is then normalized (i.e. re-centered and re-scaled) separately,

$$\hat{x}_{i}^{(k)}=\frac{x_{i}^{(k)}-\mu_{B}^{(k)}}{\sqrt{\left(\sigma_{B}^{(k)}\right)^{2}+\epsilon}},$$

where $k\in[1,d]$ and $i\in[1,m]$; $\mu_{B}^{(k)}$ and $\sigma_{B}^{(k)}$ are the per-dimension mean and standard deviation, respectively. $\epsilon$ is added in the denominator for numerical stability and is an arbitrarily small positive constant. The resulting normalized activations $\hat{x}^{(k)}$ have zero mean and unit variance, if $\epsilon$ is not taken into account. To restore the representation power of the network, a transformation step then follows as

$$y_{i}^{(k)}=\gamma^{(k)}\hat{x}_{i}^{(k)}+\beta^{(k)},$$

where the parameters $\gamma^{(k)}$ and $\beta^{(k)}$ are subsequently learned in the optimization process.

Formally, the operation that implements batch normalization is a transform $BN_{\gamma^{(k)},\beta^{(k)}}:x_{1...m}^{(k)}\rightarrow y_{1...m}^{(k)}$ called the Batch Normalizing transform. The output of the BN transform $y^{(k)}=BN_{\gamma^{(k)},\beta^{(k)}}(x^{(k)})$ is then passed to other network layers, while the normalized output $\hat{x}_{i}^{(k)}$ remains internal to the current layer.
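The transformation above can be sketched in plain Python for a mini-batch stored as a list of samples. This is a minimal illustration of the training-time forward pass, not a reference implementation: the function name, the list-of-lists layout and the default value of `eps` are assumptions.

```python
import math


def batch_norm_forward(batch, gamma, beta, eps=1e-5):
    """Training-time batch normalization of m samples with d features each.

    Returns the outputs y together with the per-dimension mini-batch mean
    and (biased, 1/m) variance, which a caller may want to accumulate.
    """
    m, d = len(batch), len(batch[0])
    mean = [sum(x[k] for x in batch) / m for k in range(d)]
    var = [sum((x[k] - mean[k]) ** 2 for x in batch) / m for k in range(d)]
    # Normalize each dimension separately, then scale by gamma and shift by beta.
    x_hat = [[(x[k] - mean[k]) / math.sqrt(var[k] + eps) for k in range(d)]
             for x in batch]
    y = [[gamma[k] * xh[k] + beta[k] for k in range(d)] for xh in x_hat]
    return y, mean, var


if __name__ == "__main__":
    batch = [[1.0, 10.0], [3.0, 30.0]]
    y, mean, var = batch_norm_forward(batch, gamma=[1.0, 2.0], beta=[0.0, 1.0])
    print(mean, var)
    print(y)
```

With `gamma` set to ones and `beta` to zeros, each output dimension has approximately zero mean and unit variance over the mini-batch.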
=== Backpropagation ===

The described BN transform is a differentiable operation, and the gradient of the loss $l$ with respect to the different parameters can be computed directly with the chain rule. Specifically, $\frac{\partial l}{\partial y_{i}^{(k)}}$ depends on the choice of activation function, and the gradients with respect to the other parameters can be expressed as functions of $\frac{\partial l}{\partial y_{i}^{(k)}}$:

$$\frac{\partial l}{\partial \hat{x}_{i}^{(k)}}=\frac{\partial l}{\partial y_{i}^{(k)}}\gamma^{(k)},$$

$$\frac{\partial l}{\partial \gamma^{(k)}}=\sum_{i=1}^{m}\frac{\partial l}{\partial y_{i}^{(k)}}\hat{x}_{i}^{(k)},$$

$$\frac{\partial l}{\partial \beta^{(k)}}=\sum_{i=1}^{m}\frac{\partial l}{\partial y_{i}^{(k)}},$$

$$\frac{\partial l}{\partial \left(\sigma_{B}^{(k)}\right)^{2}}=\sum_{i=1}^{m}\frac{\partial l}{\partial y_{i}^{(k)}}\left(x_{i}^{(k)}-\mu_{B}^{(k)}\right)\left(-\frac{\gamma^{(k)}}{2}\left(\left(\sigma_{B}^{(k)}\right)^{2}+\epsilon\right)^{-3/2}\right),$$

$$\frac{\partial l}{\partial \mu_{B}^{(k)}}=\sum_{i=1}^{m}\frac{\partial l}{\partial y_{i}^{(k)}}\frac{-\gamma^{(k)}}{\sqrt{\left(\sigma_{B}^{(k)}\right)^{2}+\epsilon}}+\frac{\partial l}{\partial \left(\sigma_{B}^{(k)}\right)^{2}}\frac{1}{m}\sum_{i=1}^{m}(-2)\cdot\left(x_{i}^{(k)}-\mu_{B}^{(k)}\right),$$

and

$$\frac{\partial l}{\partial x_{i}^{(k)}}=\frac{\partial l}{\partial \hat{x}_{i}^{(k)}}\frac{1}{\sqrt{\left(\sigma_{B}^{(k)}\right)^{2}+\epsilon}}+\frac{\partial l}{\partial \left(\sigma_{B}^{(k)}\right)^{2}}\frac{2\left(x_{i}^{(k)}-\mu_{B}^{(k)}\right)}{m}+\frac{\partial l}{\partial \mu_{B}^{(k)}}\frac{1}{m}.$$

=== Inference ===

During the training stage, the normalization steps depend on the mini-batches to ensure efficient and reliable training. In the inference stage, however, this dependence is no longer useful. Instead, the normalization step in this stage is computed with the population statistics, so that the output depends on the input in a deterministic manner. The population mean, $E[x^{(k)}]$, and variance, $\operatorname{Var}[x^{(k)}]$, are computed as:

$$E[x^{(k)}]=E_{B}[\mu_{B}^{(k)}] \quad\text{and}\quad \operatorname{Var}[x^{(k)}]=\frac{m}{m-1}E_{B}\left[\left(\sigma_{B}^{(k)}\right)^{2}\right].$$

The population statistics are thus a complete representation of the mini-batches. The BN transform in the inference step thus becomes

$$y^{(k)}=BN_{\gamma^{(k)},\beta^{(k)}}^{\text{inf}}(x^{(k)})=\gamma^{(k)}\frac{x^{(k)}-E[x^{(k)}]}{\sqrt{\operatorname{Var}[x^{(k)}]+\epsilon}}+\beta^{(k)}$$
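The inference-time computation can be sketched in the same plain-Python style. The sketch assumes the per-mini-batch means and biased variances were recorded during training and that every mini-batch had the same size `m`; the function names and the default `eps` are hypothetical. Many libraries keep running averages instead of storing every mini-batch statistic, which is a different bookkeeping choice from the one shown here.

```python
import math


def population_statistics(batch_means, batch_vars, m):
    """Average recorded mini-batch statistics into population estimates.

    The factor m / (m - 1) turns the averaged biased (1/m) variances into
    an unbiased variance estimate.
    """
    n, d = len(batch_means), len(batch_means[0])
    pop_mean = [sum(bm[k] for bm in batch_means) / n for k in range(d)]
    pop_var = [m / (m - 1) * sum(bv[k] for bv in batch_vars) / n
               for k in range(d)]
    return pop_mean, pop_var


def batch_norm_inference(x, pop_mean, pop_var, gamma, beta, eps=1e-5):
    """Deterministic BN transform of one sample using population statistics."""
    return [gamma[k] * (x[k] - pop_mean[k]) / math.sqrt(pop_var[k] + eps) + beta[k]
            for k in range(len(x))]


if __name__ == "__main__":
    pop_mean, pop_var = population_statistics([[1.0], [3.0]], [[2.0], [4.0]], m=4)
    print(pop_mean, pop_var)
    print(batch_norm_inference([4.0], pop_mean, pop_var, gamma=[2.0], beta=[1.0]))
```

Because the statistics are fixed, the same input always maps to the same output, independent of which other samples are processed alongside it.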
Read more →
Text simplification

Text simplification is an aspect of natural language processing that involves modifying, organizing, or categorizing existing text to make it easier to understand while retaining its original meaning. This process is essential in today's world, where communication is increasingly complex due to advancements in science, technology, and media. Human languages are inherently intricate, with extensive vocabularies and complex structures that can be challenging for machines to handle efficiently. Researchers have found that semantic compression techniques can help streamline and simplify text by reducing linguistic diversity and simplifying the vocabulary used in a given context.

== Example ==
Text simplification involves modifying complex sentences into simpler ones to enhance readability and comprehension. Siddharthan (2006) provides an example to illustrate this process. The original sentence contains multiple clauses and phrases, which can be broken down into simpler sentences for better understanding.

Original: Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold.

Simplified: Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today.

One approach to text simplification is lexical simplification via lexical substitution, a process that replaces complex words with simpler synonyms. Identifying complex words is a challenge addressed by machine learning classifiers trained on labeled data. Researchers have found that asking labelers to sort words by complexity levels yields more consistent results than the traditional method of categorizing words as simple or complex.
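Lexical substitution, as described above, can be illustrated with a very small sketch. The lexicon below is a hypothetical, hand-written mapping invented for this example; a real system would identify complex words with a trained classifier and choose substitutes in context, which this sketch does not do.

```python
import re

# Hypothetical toy lexicon: complex word -> simpler substitute (illustrative only).
SIMPLER = {
    "precedes": "comes before",
    "indication": "sign",
    "firmness": "strength",
}

def simplify_lexically(text, lexicon=SIMPLER):
    """Replace each word found in the lexicon with its simpler substitute."""
    def swap(match):
        word = match.group(0)
        replacement = lexicon.get(word.lower())
        if replacement is None:
            return word  # not flagged as complex: keep the word unchanged
        # Preserve sentence-initial capitalization of the replaced word.
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(r"[A-Za-z]+", swap, text)
```

Calling `simplify_lexically` on "The Chicago report precedes the full purchasing agents report." swaps "precedes" for "comes before" and leaves every other word untouched. A plain lookup like this ignores word sense and grammatical agreement, which is one reason the identification step is usually learned from labeled data.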
Read more →
The Life and Times of Multivac

"The Life and Times of Multivac" is a science fiction short story by American writer Isaac Asimov. The story first appeared in the 5 January 1975 issue of The New York Times Magazine, and was reprinted in the collections The Bicentennial Man and Other Stories and The Best of Creative Computing in 1976. It is one of a loosely connected series of stories concerning a fictional supercomputer called Multivac. "The Life and Times of Multivac" was the first piece of fiction ever commissioned and published by The New York Times. Asimov's original title for the story was "Mathematical Games", but after the story appeared under the new title he decided he liked it. In his commentary on the story in The Bicentennial Man and Other Stories collection, Asimov stated, "More people came up to me over the next few weeks to tell me they had read that story than had ever been the case for any other story I had ever written." == Plot summary == When humanity begins to chafe under Multivac’s benevolent tyranny, one man takes matters into his own hands to destroy the great computer. By appearing to betray his fellow humans, he places himself in a position to permanently destroy Multivac. It is implied that it is not until completion of the act that he and his peers suddenly realize the enormity of their actions and the consequences it will have on humanity.
Read more →
Theaitre

Theaitre (stylized as THEaiTRE) is an interdisciplinary research project investigating to what extent artificial intelligence is able to generate theatre play scripts. The first theatre play produced within the project, AI: When a Robot Writes a Play, premiered online on February 26, 2021.

== Goal ==
Following similar previous projects such as Sunspring, a short sci-fi movie with an automatically generated script, the THEaiTRE project investigates whether current language generation approaches are mature enough to generate a theatre play script that could be successfully performed in front of an audience. The project falls within the area of generative art, famously represented, for example, by the portrait of Edmond de Belamy, which was generated by an artificial neural network. In this field, artists use automated techniques to create "art", questioning the modern definition of art itself. More broadly, the project aims to promote cooperation rather than competition between humans and artificial intelligence as the more beneficial approach for both. The first theatre play created within the project, titled AI: When a Robot Writes a Play, was presented in February 2021 at the 100th anniversary of the premiere of the R.U.R. theatre play by the Czech author Karel Čapek to celebrate the invention of the word "robot". While R.U.R. was a play written by a human about robots (and humans), THEaiTRE tried to reverse this idea by presenting a play written by a "robot" (artificial intelligence) about humans (and robots). The script of the play was published online, with markings on the parts of the text that were written manually or manually post-edited. The analysis shows that 90% of the script is automatically generated, with 10% manually written or manually post-edited.
The project also plans to produce a second play in 2022, addressing some of the many shortcomings of the approach used to generate the first play, as well as attempting to further minimize the amount of human influence on the script.

== Approach ==
At the core of the project is the GPT-2 language model by OpenAI, with various adjustments motivated by the task of generating theatre play scripts, for which the model was not specifically trained. The GPT-2 model is used in the usual way: it is given the start of a document and prompted to generate a continuation. Specifically, the input for GPT-2 in this project is typically a short description of the scene setting, followed by a few lines that introduce the characters and start the dialogue. The model then generates 10 continuation lines and hands control to the user, who can either ask the model to continue generating or make various edits, such as deleting parts of the script or adding new lines, before letting the model generate further. The adjustments include restricting the generator to only produce lines pertaining to characters appearing in the input prompt, limiting the repetitiveness of the generated text, and employing automatic summarization of the input prompt and the generated text to overcome the limitation that the GPT-2 model only attends to the last 1,024 subword tokens. The limitations of the model include, among others, a lack of distinctiveness and self-consistency of the characters, an inability to generate the script for the whole play (scripts for individual scenes are generated independently), and errors due to the use of automated machine translation, as GPT-2 generates English text but the final play script is produced in Czech. The source code of the project is available under the MIT licence. The project has also published some sample outputs.
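One of the adjustments described above, restricting output to characters that appear in the input prompt, can be approximated with a simple post-hoc filter. The sketch below is an assumption-laden illustration and not the project's actual implementation: it assumes script lines of the form "Name: text", uses hypothetical function names, and filters already generated lines instead of constraining the model during decoding.

```python
import re

# Assumed script format: a speaker name, then a colon, then the spoken line.
SPEAKER = re.compile(r"^([A-Za-z][A-Za-z ]*):")

def speakers_in(lines):
    """Collect the character names that open a line in the given script lines."""
    names = set()
    for line in lines:
        match = SPEAKER.match(line)
        if match:
            names.add(match.group(1).strip())
    return names

def keep_known_speakers(prompt_lines, generated_lines, max_lines=10):
    """Keep up to max_lines generated lines spoken by characters from the prompt."""
    allowed = speakers_in(prompt_lines)
    kept = []
    for line in generated_lines:
        match = SPEAKER.match(line)
        if match and match.group(1).strip() in allowed:
            kept.append(line)
        if len(kept) == max_lines:
            break
    return kept
```

The `max_lines=10` default mirrors the 10 continuation lines mentioned in the text. Filtering after generation is simpler to show than constrained decoding, but it can discard most of a continuation when the model drifts towards new characters.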
== Team ==
The project is a cooperation of the following experts, all based in Prague, Czech Republic: computational linguists from the Faculty of Mathematics and Physics, Charles University; theatre experts from the Švanda Theatre and from the Theatre Faculty of the Academy of Performing Arts in Prague; and hackers from CEE Hacks. The project is financially supported by the Technology Agency of the Czech Republic.
Read more →