AI For Business Edinburgh

AI For Business Edinburgh — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • WebPlus

    WebPlus

    Serif WebPlus was a website design program for Microsoft Windows, developed by the software company, Serif. It allows users to design, create and upload their website onto the internet without any knowledge of HTML or other web technologies. Much like Microsoft Word, WebPlus uses WYSIWYG drag and drop editing to add and position text, images and links as they would appear on the finished web page. Once a user has designed their site, WebPlus can preview the site in a web browser before uploading the site using the in-built FTP. The software comes with a variety of pre-designed sample websites containing Filler text like Lorem ipsum, which can be used as a template for quickly designing a site. It also provides drawing tools for creating and editing buttons and web graphics. == Free WebPlus Starter Edition == Previously Serif had made available feature limited Starter Editions of their software, based on older versions, which could be obtained and used free of charge. For WebPlus the final free edition was based on version X5 and this was released in September 2012. This continued to be available from Serif's server until it was withdrawn around March 2016. WebPlus was then only available as a paid-for version X8. == Program Withdrawal == In March 2016, Serif announced that WebPlus X8 would be the final version, and that there were no current plans to design an application to replace it. Sales of WebPlus X8 by Serif were ended around December 2016. In early 2018, Serif announced that Serif Web Resources, hosted on Serif servers and required to implement some advanced web-site functionality in WebPlus created sites, would no longer work after 31 August 2018. In 2018, Serif also shutdown the servers that generated the "Plus" software registration numbers on-line from the product version and the individual generated installation number. Serif revealed the alternative was to use a universal master registration number, which is 881887. This is known to work with post 2003 Serif "Plus" software (e.g. verified to work with PagePlus v5.02). However, later Serif "Plus" software still registers itself automatically if within a certain recent period of a previous Serif software registration on the same PC. == Supported platforms == WebPlus was developed for Microsoft Windows "Win32" graphical desktop interface and is fully compatible with Windows XP, Windows Vista (32/64bit), Windows 7 (32/64bit) and Windows 8. == Features == Web hosting to upload websites to the internet with the address www.sitename.webplus.net and email [email protected]. E-Commerce tool to create online stores with providers such as PayPal. Form wizard generates online forms to collect information from website visitors. Add blogs, forums, hit counters, online polls and content management systems to websites using Smart Objects. Google Maps tool embeds maps and optional navigation markers within a website. Site navigation bars adopt a website's structure providing a tool for navigating around the website. Photo gallery groups a collection of images together and displays them as an animated slideshow. Search engine optimization (SEO) tools optimise a websites search ranking with the likes of Google, Yahoo! and Bing. Collect website metrics such as page popularity and number of website hits using Google Analytics. WebPlus X5 introduced a button studio for creating button graphics. Restrict access to specific pages on a website with a secure member's area. WebPlus automatically converts images and graphics into a web targeted format, optimising them for fast download. Embed YouTube videos within a web page. Add animated effects to a website with Animated GIFs, Animated Marquees or by importing Flash videos. Stream news and information feeds to a website using RSS and podcasts. Automated Site Checker analyses and corrects potential problems with a website. AdSense tool incorporates Google AdSense advertisements into a website In-built FTP transfers files onto a web server, uploading a website to the internet. In-built Basic Photo Editor the PhotoLab can make automatic adjustments and "Quick Fix's" to photos. From X5, WebPlus offers image editing and filters, through its PhotoLab and also provides a dedicated background-removal tool in the form of Cutout Studio. Display images, Flash videos and web pages using animated Lightboxes. Filter Effects can be applied to the graphical objects, giving convincing, realistic effects such as glass, metallic, plastic and other 2D/3D filters. WebPlus also provides QuickShapes for creating button and web graphics. These predefined shapes can be quickly modified with sliders to adjust certain parameters, for example creating rounded rectangles, etc. Shapes include: rectangles, ellipses, stars, spirals, cogs, petals, etc.

    Read more →
  • Knowledge graph embedding

    Knowledge graph embedding

    In representation learning, knowledge graph embedding (KGE), also called knowledge representation learning (KRL), or multi-relation learning, is a machine learning task of learning a low-dimensional representation of a knowledge graph's entities and relations while preserving their semantic meaning. Leveraging their embedded representation, knowledge graphs can be used for various applications such as link prediction, triple classification, entity recognition, clustering, and relation extraction. == Definition == A knowledge graph G = { E , R , F } {\displaystyle {\mathcal {G}}=\{E,R,F\}} is a collection of entities E {\displaystyle E} , relations R {\displaystyle R} , and facts F {\displaystyle F} . A fact is a triple ( h , r , t ) ∈ F {\displaystyle (h,r,t)\in F} that denotes a link r ∈ R {\displaystyle r\in R} between the head h ∈ E {\displaystyle h\in E} and the tail t ∈ E {\displaystyle t\in E} of the triple. Another notation that is often used in the literature to represent a triple (or fact) is ⟨ head , relation , tail ⟩ {\displaystyle \langle {\text{head}},{\text{relation}},{\text{tail}}\rangle } . This notation is called the Resource Description Framework (RDF). A knowledge graph represents the knowledge related to a specific domain; leveraging this structured representation, it is possible to infer a piece of new knowledge from it after some refinement steps. However, nowadays, people have to deal with the sparsity of data and the computational inefficiency to use them in a real-world application. The embedding of a knowledge graph is a function that translates each entity and each relation into a vector of a given dimension d {\displaystyle d} , called embedding dimension. It is even possible to embed the entities and relations with different dimensions. The embedding vectors can then be used for other tasks. A knowledge graph embedding is characterized by four aspects: Representation space: The low-dimensional space in which the entities and relations are represented. Scoring function: A measure of the goodness of a triple-embedded representation. Encoding models: The modality in which the embedded representation of the entities and relations interact with each other. Additional information: Any additional information coming from the knowledge graph that can enrich the embedded representation. Usually, an ad hoc scoring function is integrated into the general scoring function for each additional piece of information. == Embedding procedure == All algorithms for creating a knowledge graph embedding follow the same approach. First, the embedding vectors are initialized to random values. Then, they are iteratively optimized using a training set of triples. In each iteration, a batch of size b {\displaystyle b} triples is sampled from the training set, and a triple from it is sampled and corrupted—i.e., a triple that does not represent a true fact in the knowledge graph. The corruption of a triple involves substituting the head or the tail (or both) of the triple with another entity that makes the fact false. The original triple and the corrupted triple are added in the training batch, and then the embeddings are updated, optimizing a scoring function. Iteration stops when a stop condition is reached. Usually, the stop condition depends on the overfitting of the training set. At the end, the learned embeddings should have extracted semantic meaning from the training triples and should correctly predict unseen true facts in the knowledge graph. === Pseudocode === The following is the pseudocode for the general embedding procedure. algorithm Compute entity and relation embeddings input: The training set S = { ( h , r , t ) } {\displaystyle S=\{(h,r,t)\}} , entity set E {\displaystyle E} , relation set R {\displaystyle R} , embedding dimension k {\displaystyle k} output: Entity and relation embeddings initialization: the entities e {\displaystyle e} and relations r {\displaystyle r} embeddings (vectors) are randomly initialized while stop condition do S b a t c h ← s a m p l e ( S , b ) {\displaystyle S_{batch}\leftarrow sample(S,b)} // Sample a batch from the training set for each ( h , r , t ) {\displaystyle (h,r,t)} in S b a t c h {\displaystyle S_{batch}} do ( h ′ , r , t ′ ) ← s a m p l e ( S ′ ) {\displaystyle (h',r,t')\leftarrow sample(S')} // Sample a corrupted fact T b a t c h ← T b a t c h ∪ { ( ( h , r , t ) , ( h ′ , r , t ′ ) ) } {\displaystyle T_{batch}\leftarrow T_{batch}\cup \{((h,r,t),(h',r,t'))\}} end for Update embeddings by minimizing the loss function end while == Performance indicators == These indexes are often used to measure the embedding quality of a model. The simplicity of the indexes makes them very suitable for evaluating the performance of an embedding algorithm even on a large scale. Given Q {\displaystyle {\ce {Q}}} as the set of all ranked predictions of a model, it is possible to define three different performance indexes: Hits@K, MR, and MRR. === Hits@K === Hits@K or in short, H@K, is a performance index that measures the probability to find the correct prediction in the first top K model predictions. Usually, it is used k = 10 {\displaystyle k=10} . Hits@K reflects the accuracy of an embedding model to predict the relation between two given triples correctly. Hits@K = | { q ∈ Q : q < k } | | Q | ∈ [ 0 , 1 ] {\displaystyle ={\frac {|\{q\in Q:q Read more →

  • Data-centric AI

    Data-centric AI

    Data-centric AI is an approach within artificial intelligence that emphasizes on improving the quality, consistency and representativeness of the data used to train machine learning models, rather than focusing primarily on optimizing model architectures or algorithms. This idea has gained traction as researchers and practitioners have come to believe that many performance limitations of machine learning systems stem from issues such as noisy labels, biased datasets, and lack of coverage in the data. Data-centric AI involves disciplined approach to data cleaning, augmentation, labeling, and governance that improves model performance and reliability in applications such as computer vision, natural language processing, and further.

    Read more →
  • Knowledge integration

    Knowledge integration

    Knowledge integration is the process of synthesizing multiple knowledge models (or representations) into a common model (representation). Compared to information integration, which involves merging information having different schemas and representation models, knowledge integration focuses more on synthesizing the understanding of a given subject from different perspectives. For example, multiple interpretations are possible of a set of student grades, typically each from a certain perspective. An overall, integrated view and understanding of this information can be achieved if these interpretations can be put under a common model, say, a student performance index. The Web-based Inquiry Science Environment (WISE), from the University of California at Berkeley has been developed along the lines of knowledge integration theory. Knowledge integration has also been studied as the process of incorporating new information into a body of existing knowledge with an interdisciplinary approach. This process involves determining how the new information and the existing knowledge interact, how existing knowledge should be modified to accommodate the new information, and how the new information should be modified in light of the existing knowledge. A learning agent that actively investigates the consequences of new information can detect and exploit a variety of learning opportunities; e.g., to resolve knowledge conflicts and to fill knowledge gaps. By exploiting these learning opportunities the learning agent is able to learn beyond the explicit content of the new information. The machine learning program KI, developed by Murray and Porter at the University of Texas at Austin, was created to study the use of automated and semi-automated knowledge integration to assist knowledge engineers constructing a large knowledge base. A possible technique which can be used is semantic matching. More recently, a technique useful to minimize the effort in mapping validation and visualization has been presented which is based on Minimal Mappings. Minimal mappings are high quality mappings such that i) all the other mappings can be computed from them in time linear in the size of the input graphs, and ii) none of them can be dropped without losing property i). The University of Waterloo operates a Bachelor of Knowledge Integration undergraduate degree program as an academic major or minor. The program started in 2008.

    Read more →
  • Display list

    Display list

    A display list, also called a command list in Direct3D 12 and a command buffer in Vulkan, is a series of graphics commands or instructions that are run when the list is executed. Systems that make use of display list functionality are called retained mode systems, while systems that do not are as opposed to immediate mode systems. In OpenGL, display lists are useful to redraw the same geometry or apply a set of state changes multiple times. This benefit is also used with Direct3D 12's bundle command lists. In Direct3D 12 and Vulkan, display lists are regularly used for per-frame recording and execution. == Origins in vector displays == The vector monitors or calligraphic displays of the 1960s and 1970s used electron beam deflection to draw line segments, points, and sometimes curves directly on a CRT screen. Because the image would immediately fade, it needed to be redrawn many times a second (storage tube CRTs retained the image until blanked, but they were unsuitable for interactive graphics). To refresh the display, a dedicated CPU called a Display Processor or Display Processing Unit (DPU) was used, which had a memory buffer for a "display list", "display file", or "display program" containing line segment coordinates and other information. Advanced Display Processors also supported control flow instructions, which were useful for drawing repetitive graphics such as text, and some could perform coordinate transformations such as 3D projection. == Home computer display list functionality == One of the earliest systems with a true display list was the Atari 8-bit computers. The display list (actually called so in Atari terminology) is a series of instructions for ANTIC, the video co-processor used in these machines. This program, stored in the computer's memory and executed by ANTIC in real-time, can specify blank lines, any of six text modes and eight graphics modes, which sections of the screen can be horizontally or vertically fine-scrolled, and trigger Display List Interrupts (called raster interrupts or HBI on other systems). The Amstrad PCW family contains a Display List function called the 'Roller RAM'. This is a 512-byte RAM area consisting of 256 16-bit pointers in RAM, one for each line of the 720 × 256 pixel display. Each pointer identifies the location of 90 bytes of monochrome pixels that hold the line's 720 pixel states. The 90 bytes of 8 pixel states are spaced at 8-byte intervals, so there are 7 unused bytes between each byte of pixel data. This suits how the text-orientated PCW constructs a typical screen buffer in RAM, where the first character's 8 rows are stored in the first 8 bytes, the second character's rows in the next 8 bytes, and so on. The Roller RAM was implemented to speed up display scrolling as it would have been unacceptably slow for its 3.4 MHz Z80 to move up the 23 KB display buffer 'by hand' i.e. in software. The Roller RAM starting entry used at the beginning of a screen refresh is controlled by a Z80-writable I/O register. Therefore, the screen can be scrolled simply by changing this I/O register. Another system using a Display List-like feature in hardware is the Amiga, which, not coincidentally, was also designed by some of the same people who developed the custom hardware for the Atari 8-bit computers. Once directed to produce a display mode, it would continue to do so automatically for every following scan line. The computer also included a dedicated co-processor, called "Copper", which ran a simple program or 'Copper List' intended for modifying hardware registers in sync with the display. The Copper List instructions could direct the Copper to wait for the display to reach a specific position on the screen, and then change the contents of hardware registers. In effect, it was a processor dedicated to servicing raster interrupts. The Copper was used by Workbench to mix multiple display modes (multiple resolutions and color palettes on the monitor at the same time), and by numerous programs to create rainbow and gradient effects on the screen. The Amiga Copper was also capable of reconfiguring the sprite engine mid-frame, with only one scanline of delay. This allowed the Amiga to draw more than its 8 hardware sprites, so long as the additional sprites did not share scanlines (or the one scanline gap) with more than 7 other sprites. i.e., so long as at least one sprite had finished drawing, another sprite could be added below it on the screen. Additionally, the later 32-bit AGA chipset allowed the drawing of bigger sprites (more pixels per row) while retaining the same multiplexing. The Amiga also had dedicated block-shifter ("blitter") hardware, which could draw larger objects into a framebuffer. This was often used in place of, or in addition to, sprites. In more primitive systems, the results of a display list can be simulated, though at the cost of CPU-intensive writes to certain display modes, color control, or other visual effect registers in the video device, rather than a series of rendering commands executed by the device. Thus, one must create the displayed image using some other rendering process, either before or while the CPU-driven display generation executes. In many cases, the image is also modified or re-rendered between frames. The image is then displayed in various ways, depending on the exact way in which the CPU-driven display code is implemented. Examples of the results possible on these older machines requiring CPU-driven video include effects such as Commodore 64/128's FLI mode, or Rainbow Processing on the ZX Spectrum. == Usage in OpenGL == To delimit a display list, the glNewList and glEndList functions are used, and to execute the list, the glCallList function is used. Almost all rendering commands that occur between the function calls are stored in the display list. Commands that affect the client state are not stored in display lists. Display lists are named with an integer value, and creating a display list with the same name as one already created overrides the first. The glNewList function expects two arguments: an integer representing the name of the list, and an enumeration for the compilation mode. The two modes include GL_COMPILE_AND_EXECUTE, which compiles and immediately executes, and GL_COMPILE, which only compiles the list. Display lists enable the use of the retained mode rendering pattern, which is a system in which graphics commands are recorded (retained) to execute in succession at a later time. This is contrary to immediate mode, where graphics commands are immediately executed on client calls. == Usage in Direct3D 12 == Command lists are created using the ID3D12Device::CreateCommandList function. Command lists may be created in several types: direct, bundle, compute, copy, video decode, video process, and video encoding. Direct command lists specify that a command list the GPU can execute, and doesn't inherit any GPU state. Bundles, are best used for storing and executing small sets of commands any number of times. This is used differently than regular command lists, where commands stored in a command list are typically executed only once. Compute command lists are used for general computations, with a common use being calculating mipmaps. A copy command list is strictly for copying and the video decode and video process command lists are for video decoding and processing respectively. Upon creation, command lists are in the recording state. Command lists may be re-used by calling the ID3D12GraphicsCommandList::Reset function. After recording commands, the command list must be transitioned out of the recording state by calling ID3D12GraphicsCommandList::Close. The command list is then executed by calling ID3D12CommandQueue::ExecuteCommandLists.

    Read more →
  • Apprenticeship learning

    Apprenticeship learning

    In artificial intelligence, apprenticeship learning (or learning from demonstration or imitation learning) is the process of learning by observing an expert. It can be viewed as a form of supervised learning, where the training dataset consists of task executions by a demonstration teacher. == Mapping function approach == Mapping methods try to mimic the expert by forming a direct mapping either from states to actions, or from states to reward values. For example, in 2002 researchers used such an approach to teach an AIBO robot basic soccer skills. === Inverse reinforcement learning approach === Inverse reinforcement learning (IRL) is the process of deriving a reward function from observed behavior. While ordinary "reinforcement learning" involves using rewards and punishments to learn behavior, in IRL the direction is reversed, and a robot observes a person's behavior to figure out what goal that behavior seems to be trying to achieve. The IRL problem can be defined as: Given 1) measurements of an agent's behaviour over time, in a variety of circumstances; 2) measurements of the sensory inputs to that agent; 3) a model of the physical environment (including the agent's body): Determine the reward function that the agent is optimizing. IRL researcher Stuart J. Russell proposes that IRL might be used to observe humans and attempt to codify their complex "ethical values", in an effort to create "ethical robots" that might someday know "not to cook your cat" without needing to be explicitly told. The scenario can be modeled as a "cooperative inverse reinforcement learning game", where a "person" player and a "robot" player cooperate to secure the person's implicit goals, despite these goals not being explicitly known by either the person nor the robot. In 2017, OpenAI and DeepMind applied deep learning to the cooperative inverse reinforcement learning in simple domains such as Atari games and straightforward robot tasks such as backflips. The human role was limited to answering queries from the robot as to which of two different actions were preferred. The researchers found evidence that the techniques may be economically scalable to modern systems. Apprenticeship via inverse reinforcement learning (AIRP) was developed by in 2004 Pieter Abbeel, Professor in Berkeley's EECS department, and Andrew Ng, Associate Professor in Stanford University's Computer Science Department. AIRP deals with "Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we want to learn to perform". AIRP has been used to model reward functions of highly dynamic scenarios where there is no obvious reward function intuitively. Take the task of driving for example, there are many different objectives working simultaneously - such as maintaining safe following distance, a good speed, not changing lanes too often, etc. This task, may seem easy at first glance, but a trivial reward function may not converge to the policy wanted. One domain where AIRP has been used extensively is helicopter control. While simple trajectories can be intuitively derived, complicated tasks like aerobatics for shows has been successful. These include aerobatic maneuvers like - in-place flips, in-place rolls, loops, hurricanes and even auto-rotation landings. This work was developed by Pieter Abbeel, Adam Coates, and Andrew Ng - "Autonomous Helicopter Aerobatics through Apprenticeship Learning" === System model approach === System models try to mimic the expert by modeling world dynamics. == Plan approach == The system learns rules to associate preconditions and postconditions with each action. In one 1994 demonstration, a humanoid learns a generalized plan from only two demonstrations of a repetitive ball collection task. == Example == Learning from demonstration is often explained from a perspective that the working Robot-control-system is available and the human-demonstrator is using it. And indeed, if the software works, the Human operator takes the robot-arm, makes a move with it, and the robot will reproduce the action later. For example, he teaches the robot-arm how to put a cup under a coffeemaker and press the start-button. In the replay phase, the robot is imitating this behavior 1:1. But that is not how the system works internally; it is only what the audience can observe. In reality, Learning from demonstration is much more complex. One of the first works on learning by robot apprentices (anthropomorphic robots learning by imitation) was Adrian Stoica's PhD thesis in 1995. In 1997, robotics expert Stefan Schaal was working on the Sarcos robot-arm. The goal was simple: solve the pendulum swingup task. The robot itself can execute a movement, and as a result, the pendulum is moving. The problem is, that it is unclear what actions will result into which movement. It is an Optimal control-problem which can be described with mathematical formulas but is hard to solve. The idea from Schaal was, not to use a Brute-force solver but record the movements of a human-demonstration. The angle of the pendulum is logged over three seconds at the y-axis. This results into a diagram which produces a pattern. In computer animation, the principle is called spline animation. That means, on the x-axis the time is given, for example 0.5 seconds, 1.0 seconds, 1.5 seconds, while on the y-axis is the variable given. In most cases it's the position of an object. In the inverted pendulum it is the angle. The overall task consists of two parts: recording the angle over time and reproducing the recorded motion. The reproducing step is surprisingly simple. As an input we know, in which time step which angle the pendulum must have. Bringing the system to a state is called “Tracking control” or PID control. That means, we have a trajectory over time, and must find control actions to map the system to this trajectory. Other authors call the principle “steering behavior”, because the aim is to bring a robot to a given line.

    Read more →
  • Action model learning

    Action model learning

    Action model learning (sometimes abbreviated action learning) is an area of machine learning concerned with the creation and modification of a software agent's knowledge about the effects and preconditions of the actions that can be executed within its environment. This knowledge is usually represented in a logic-based action description language and used as input for automated planners. Learning action models is important when goals change. When an agent acted for a while, it can use its accumulated knowledge about actions in the domain to make better decisions. Thus, learning action models differs from reinforcement learning. It enables reasoning about actions instead of expensive trials in the world. Action model learning is a form of inductive reasoning, where new knowledge is generated based on the agent's observations. The usual motivation for action model learning is the fact that manual specification of action models for planners is often a difficult, time-consuming, and error-prone task (especially in complex environments). == Action models == Given a training set E {\displaystyle E} consisting of examples e = ( s , a , s ′ ) {\displaystyle e=(s,a,s')} , where s , s ′ {\displaystyle s,s'} are observations of a world state from two consecutive time steps t , t ′ {\displaystyle t,t'} and a {\displaystyle a} is an action instance observed in time step t {\displaystyle t} , the goal of action model learning in general is to construct an action model ⟨ D , P ⟩ {\displaystyle \langle D,P\rangle } , where D {\displaystyle D} is a description of domain dynamics in action description formalism like STRIPS, ADL or PDDL and P {\displaystyle P} is a probability function defined over the elements of D {\displaystyle D} . However, many state of the art action learning methods assume determinism and do not induce P {\displaystyle P} . In addition to determinism, individual methods differ in how they deal with other attributes of domain (e.g. partial observability or sensoric noise). == Action learning methods == === State of the art === Recent action learning methods take various approaches and employ a wide variety of tools from different areas of artificial intelligence and computational logic. As an example of a method based on propositional logic, we can mention SLAF (Simultaneous Learning and Filtering) algorithm, which uses agent's observations to construct a long propositional formula over time and subsequently interprets it using a satisfiability (SAT) solver. Another technique, in which learning is converted into a satisfiability problem (weighted MAX-SAT in this case) and SAT solvers are used, is implemented in ARMS (Action-Relation Modeling System). Two mutually similar, fully declarative approaches to action learning were based on logic programming paradigm Answer Set Programming (ASP) and its extension, Reactive ASP. In another example, bottom-up inductive logic programming approach was employed. Several different solutions are not directly logic-based. For example, the action model learning using a perceptron algorithm or the multi level greedy search over the space of possible action models. In the older paper from 1992, the action model learning was studied as an extension of reinforcement learning. Nonetheless, further algorithms can be found that operate under different assumptions: FAMA can work even when some observations are missing, and it produces a general (lifted) planning model. It treats learning an action model like a planning problem, making sure the learned model matches the observations given. NOLAM can learn general action models even from noisy or imperfect data. LOCM focuses only on the order of actions in the data, ignoring any details about the states between those actions. The family of safe action model (SAM) learning methods create models that guarantee any plans made with them will actually work in the real world. There's also an extension called N-SAM that can learn action models with numeric conditions and effects. Additionally, numeric action models like N-SAM can be used to improve reinforcement learning (RL) performance through the RAMP algorithm. === Literature === Most action learning research papers are published in journals and conferences focused on artificial intelligence in general (e.g. Journal of Artificial Intelligence Research (JAIR), Artificial Intelligence, Applied Artificial Intelligence (AAI) or AAAI conferences). Despite mutual relevance of the topics, action model learning is usually not addressed in planning conferences like the International Conference on Automated Planning and Scheduling (ICAPS).

    Read more →
  • SUPS

    SUPS

    In computational neuroscience, SUPS (for Synaptic Updates Per Second) or formerly CUPS (Connections Updates Per Second) is a measure of a neuronal network performance, useful in fields of neuroscience, cognitive science, artificial intelligence, and computer science. == Computing == For a processor or computer designed to simulate a neural network SUPS is measured as the product of simulated neurons N {\displaystyle N} and average connectivity c {\displaystyle c} (synapses) per neuron per second: S U P S = c × N {\displaystyle SUPS=c\times N} Depending on the type of simulation it is usually equal to the total number of synapses simulated. In an "asynchronous" dynamic simulation if a neuron spikes at υ {\displaystyle \upsilon } Hz, the average rate of synaptic updates provoked by the activity of that neuron is υ c N {\displaystyle \upsilon cN} . In a synchronous simulation with step Δ t {\displaystyle \Delta t} the number of synaptic updates per second would be c N Δ t {\displaystyle {\frac {cN}{\Delta t}}} . As Δ t {\displaystyle \Delta t} has to be chosen much smaller than the average interval between two successive afferent spikes, which implies Δ t < 1 υ N {\displaystyle \Delta t<{\frac {1}{\upsilon N}}} , giving an average of synaptic updates equal to υ c N 2 {\displaystyle \upsilon cN^{2}} . Therefore, spike-driven synaptic dynamics leads to a linear scaling of computational complexity O(N) per neuron, compared with the O(N2) in the "synchronous" case. == Records == Developed in the 1980s Adaptive Solutions' CNAPS-1064 Digital Parallel Processor chip is a full neural network (NNW). It was designed as a coprocessor to a host and has 64 sub-processors arranged in a 1D array and operating in a SIMD mode. Each sub-processor can emulate one or more neurons and multiple chips can be grouped together. At 25 MHz it is capable of 1.28 GMAC. After the presentation of the RN-100 (12 MHz) single neuron chip at Seattle 1991 Ricoh developed the multi-neuron chip RN-200. It had 16 neurons and 16 synapses per neuron. The chip has on-chip learning ability using a proprietary backdrop algorithm. It came in a 257-pin PGA encapsulation and drew 3.0 W at a maximum. It was capable of 3 GCPS (1 GCPS at 32 MHz). In 1991–97, Siemens developed the MA-16 chip, SYNAPSE-1 and SYNAPSE-3 Neurocomputer. The MA-16 was a fast matrix-matrix multiplier that can be combined to form systolic arrays. It could process 4 patterns of 16 elements each (16-bit), with 16 neuron values (16-bit) at a rate of 800 MMAC or 400 MCPS at 50 MHz. The SYNAPSE3-PC PCI card contained 2 MA-16 with a peak performance of 2560 MOPS (1.28 GMAC); 7160 MOPS (3.58 GMAC) when using three boards. In 2013, the K computer was used to simulate a neural network of 1.73 billion neurons with a total of 10.4 trillion synapses (1% of the human brain). The simulation ran for 40 minutes to simulate 1 s of brain activity at a normal activity level (4.4 on average). The simulation required 1 Petabyte of storage.

    Read more →
  • Lucy–Hook coaddition method

    Lucy–Hook coaddition method

    The Lucy–Hook coaddition method is an image processing technique for combining sub-stepped astronomical image data onto a finer grid. The method allows the option of resolution and contrast enhancement or the choice of a conservative, re-convolved, output. Tests with very deep Hubble Space Telescope Wide Field and Planetary Camera 2 (WFPC2) imaging data of excellent quality show that these methods can be very effective and allow fine-scale features to be studied better than on the unprocessed images. The Lucy–Hook coaddition method is an extension of the standard Richardson–Lucy deconvolution iterative restoration method. For many purposes it may be more convenient to combine dithered datasets using the Drizzle method.

    Read more →
  • Rademacher complexity

    Rademacher complexity

    In computational learning theory (machine learning and theory of computation), Rademacher complexity, named after Hans Rademacher, measures richness of a class of sets with respect to a probability distribution. The concept can also be extended to real valued functions. == Definitions == === Rademacher complexity of a set === Given a set A ⊆ R m {\displaystyle A\subseteq \mathbb {R} ^{m}} , the Rademacher complexity of A is defined as follows: Rad ⁡ ( A ) := 1 m E σ [ sup a ∈ A ∑ i = 1 m σ i a i ] {\displaystyle \operatorname {Rad} (A):={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{a\in A}\sum _{i=1}^{m}\sigma _{i}a_{i}\right]} where σ 1 , σ 2 , … , σ m {\displaystyle \sigma _{1},\sigma _{2},\dots ,\sigma _{m}} are independent random variables drawn from the Rademacher distribution i.e. Pr ( σ i = + 1 ) = Pr ( σ i = − 1 ) = 1 / 2 {\displaystyle \Pr(\sigma _{i}=+1)=\Pr(\sigma _{i}=-1)=1/2} for i ∈ { 1 , 2 , … , m } {\displaystyle i\in \{1,2,\dots ,m\}} , and a = ( a 1 , … , a m ) ∈ A {\displaystyle a=(a_{1},\ldots ,a_{m})\in A} . Some authors take the absolute value of the sum before taking the supremum, but if A {\displaystyle A} is symmetric this makes no difference. === Rademacher complexity of a function class === Let S = { z 1 , z 2 , … , z m } ⊆ Z {\displaystyle S=\{z_{1},z_{2},\dots ,z_{m}\}\subseteq Z} be a sample of points and consider a function class F {\displaystyle {\mathcal {F}}} of real-valued functions over Z {\displaystyle Z} . Then, the empirical Rademacher complexity of F {\displaystyle {\mathcal {F}}} given S {\displaystyle S} is defined as: Rad S ⁡ ( F ) = 1 m E σ [ sup f ∈ F | ∑ i = 1 m σ i f ( z i ) | ] {\displaystyle \operatorname {Rad} _{S}({\mathcal {F}})={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{f\in {\mathcal {F}}}\left|\sum _{i=1}^{m}\sigma _{i}f(z_{i})\right|\right]} This can also be written using the previous definition: Rad S ⁡ ( F ) = Rad ⁡ ( F ∘ S ) {\displaystyle \operatorname {Rad} _{S}({\mathcal {F}})=\operatorname {Rad} ({\mathcal {F}}\circ S)} where F ∘ S {\displaystyle {\mathcal {F}}\circ S} denotes function composition, i.e.: F ∘ S := { ( f ( z 1 ) , … , f ( z m ) ) ∣ f ∈ F } {\displaystyle {\mathcal {F}}\circ S:=\{(f(z_{1}),\ldots ,f(z_{m}))\mid f\in {\mathcal {F}}\}} The worst case empirical Rademacher complexity is Rad ¯ m ( F ) = sup S = { z 1 , … , z m } Rad S ⁡ ( F ) {\displaystyle {\overline {\operatorname {Rad} }}_{m}({\mathcal {F}})=\sup _{S=\{z_{1},\dots ,z_{m}\}}\operatorname {Rad} _{S}({\mathcal {F}})} Let P {\displaystyle P} be a probability distribution over Z {\displaystyle Z} . The Rademacher complexity of the function class F {\displaystyle {\mathcal {F}}} with respect to P {\displaystyle P} for sample size m {\displaystyle m} is: Rad P , m ⁡ ( F ) := E S ∼ P m [ Rad S ⁡ ( F ) ] {\displaystyle \operatorname {Rad} _{P,m}({\mathcal {F}}):=\mathbb {E} _{S\sim P^{m}}\left[\operatorname {Rad} _{S}({\mathcal {F}})\right]} where the above expectation is taken over an identically independently distributed (i.i.d.) sample S = ( z 1 , z 2 , … , z m ) {\displaystyle S=(z_{1},z_{2},\dots ,z_{m})} generated according to P {\displaystyle P} . == Intuition == The Rademacher complexity is typically applied on a function class of models that are used for classification, with the goal of measuring their ability to classify points drawn from a probability space under arbitrary labellings. When the function class is rich enough, it contains functions that can appropriately adapt for each arrangement of labels, simulated by the random draw of σ i {\displaystyle \sigma _{i}} under the expectation, so that this quantity in the sum is maximized. The Rademacher complexity of a set A {\displaystyle A} can be rewritten as Rad ⁡ ( A ) := 1 m E σ [ sup a ∈ A ∑ i = 1 m σ i a i ] = 1 m 2 m ∑ σ ∈ { − 1 / m , + 1 / m } m [ sup a ∈ A ⟨ σ , a ⟩ ] . {\displaystyle \operatorname {Rad} (A):={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{a\in A}\sum _{i=1}^{m}\sigma _{i}a_{i}\right]={\frac {1}{{\sqrt {m}}2^{m}}}\sum _{\sigma \in \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}}\left[\sup _{a\in A}\langle \sigma ,a\rangle \right].} Each term in the summation is the farthest distance of the set A {\displaystyle A} from the origin, along a unit-length direction σ {\displaystyle \sigma } . The directions are along the vertices of a hypercube. Thus, we can also write it as Rad ⁡ ( A ) = 1 2 m 1 2 m − 1 ∑ σ ∈ { − 1 / m , + 1 / m } m / { − 1 , + 1 } [ sup a ∈ A ⟨ σ , a ⟩ − inf a ∈ A ⟨ σ , a ⟩ ] {\displaystyle \operatorname {Rad} (A)={\frac {1}{2{\sqrt {m}}}}{\frac {1}{2^{m-1}}}\sum _{\sigma \in \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}/\{-1,+1\}}\left[\sup _{a\in A}\langle \sigma ,a\rangle -\inf _{a\in A}\langle \sigma ,a\rangle \right]} Here, the set { − 1 / m , + 1 / m } m / { − 1 , + 1 } {\displaystyle \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}/\{-1,+1\}} denotes half of the vertices of a hypercube, selected so that each diagonal has exactly one vertex selected. In words, this states that 2 m Rad ⁡ ( A ) {\displaystyle 2{\sqrt {m}}\operatorname {Rad} (A)} is precisely the average width of the set A {\displaystyle A} along all diagonal directions of a hypercube. == Examples == A singleton set has 0 width in any direction, so it has Rademacher complexity 0. The set A = { ( 1 , 1 ) , ( 1 , 2 ) } ⊆ R 2 {\displaystyle A=\{(1,1),(1,2)\}\subseteq \mathbb {R} ^{2}} has average width 1 / 2 {\displaystyle 1/{\sqrt {2}}} along the two diagonal directions of the square, so it has Rademacher complexity 1 / 4 {\displaystyle 1/4} . The unit cube [ 0 , 1 ] m {\displaystyle [0,1]^{m}} has constant width m {\displaystyle {\sqrt {m}}} along the diagonal directions, so it has Rademacher complexity 1 / 2 {\displaystyle 1/2} . Similarly, the unit cross-polytope { x ∈ R m : ‖ x ‖ 1 ≤ 1 } {\displaystyle \{x\in \mathbb {R} ^{m}:\|x\|_{1}\leq 1\}} has constant width 2 / m {\displaystyle 2/{\sqrt {m}}} along the diagonal directions, so it has Rademacher complexity 1 / m {\displaystyle 1/m} . == Using the Rademacher complexity == The Rademacher complexity can be used to derive data-dependent upper-bounds on the learnability of function classes. Intuitively, a function-class with smaller Rademacher complexity is easier to learn. === Bounding the representativeness === In machine learning, it is desired to have a training set that represents the true distribution of some sample data S {\displaystyle S} . This can be quantified using the notion of representativeness. Denote by P {\displaystyle P} the probability distribution from which the samples are drawn. Denote by H {\displaystyle H} the set of hypotheses (potential classifiers) and denote by F {\displaystyle {\mathcal {F}}} the corresponding set of error functions, i.e., for every hypothesis h ∈ H {\displaystyle h\in H} , there is a function f h ∈ F {\displaystyle f_{h}\in F} , that maps each training sample (features,label) to the error of the classifier h {\displaystyle h} (note in this case hypothesis and classifier are used interchangeably). For example, in the case that h {\displaystyle h} represents a binary classifier, the error function is a 0–1 loss function, i.e. the error function f h {\displaystyle f_{h}} returns 0 if h {\displaystyle h} correctly classifies a sample and 1 else. We omit the index and write f {\displaystyle f} instead of f h {\displaystyle f_{h}} when the underlying hypothesis is irrelevant. Define: L P ( f ) := E z ∼ P [ f ( z ) ] {\displaystyle L_{P}(f):=\mathbb {E} _{z\sim P}[f(z)]} – the expected error of some error function f ∈ F {\displaystyle f\in {\mathcal {F}}} on the real distribution P {\displaystyle P} ; L S ( f ) := 1 m ∑ i = 1 m f ( z i ) {\displaystyle L_{S}(f):={1 \over m}\sum _{i=1}^{m}f(z_{i})} – the estimated error of some error function f ∈ F {\displaystyle f\in {\mathcal {F}}} on the sample S {\displaystyle S} . The representativeness of the sample S {\displaystyle S} , with respect to P {\displaystyle P} and F {\displaystyle {\mathcal {F}}} , is defined as: Rep P ⁡ ( F , S ) := sup f ∈ F ( L P ( f ) − L S ( f ) ) {\displaystyle \operatorname {Rep} _{P}({\mathcal {F}},S):=\sup _{f\in F}(L_{P}(f)-L_{S}(f))} Smaller representativeness is better, since it provides a way to avoid overfitting: it means that the true error of a classifier is not much higher than its estimated error, and so selecting a classifier that has low estimated error will ensure that the true error is also low. Note however that the concept of representativeness is relative and hence can not be compared across distinct samples. The expected representativeness of a sample can be bounded above by the Rademacher complexity of the function class: If F {\displaystyle {\mathcal {F}}} is a set of functions with range within [ 0 , 1 ] {\displaystyle [0,1]} , then Rad P , m ⁡ ( F ) − ln ⁡ 2 2 m ≤ E S ∼ P m [ Rep P ⁡ ( F , S ) ] ≤ 2 Rad P , m ⁡ ( F ) {\displaystyle \operatorname {Rad} _{P,m}({\mathcal {F}})-{\sqrt {\frac {\ln 2}{2m}}}\leq \mathbb {E} _{S\sim P^{m}}[\operatorname {Rep} _{P}({\

    Read more →
  • Agent verification

    Agent verification

    Agent verification is activity to gain assurances that purposeful artificial constructs act in accordance with their specifications. While primitive forms of inorganic agents have been used in manufacturing for centuries, the study of artificial agents did not begin until the mid 20th century. Foundational work on such agents was closely bound with the emergence of artificial intelligence as an academic discipline. Early agents deployed for industrial control systems and in computing were often controlled by quite simple logic however, not involving artificial intelligence as such. When deployed as part of a multi-agent system, even such simple agents could require special agent orientated testing methods, as their collective behaviour was challenging to verify with traditional testing techniques. Difficulties in providing assurances that agents will not behave in dangerous ways became more prevalent after the introduction of LLM agents, especially after the rapid acceleration of their deployment in 2025. The verification of agent behaviour can be conducted by formal or informal methods. Informal verification requires less mathematical skill. But when agents are part of systems where errors have significant risks — such as danger to human life, environmental damage or major financial loss — formal verification is preferred. Both regulators and system designers themselves like formal verification as it provides a high degree of mathematical certainty. It is not however always possible to formally test all aspects of an agent based system's behaviour, especially where newer LLM based agents are concerned, due in part to their high degree of autonomy. Accordingly, agent verification for low impact deployments might be carried out only with informal methods, while for high impact deployments, it may be performed with a mix of formal and informal techniques. == Terminology == In academia, the term agent verification is often defined to mean activity concerned with gaining assurance that the agent behaves in accordance with its specification - whether by processes such as testing or simulation. 'Verification' is typically contrasted with 'validation', the latter meaning activity concerned with checking that the specification itself meets user or real world needs. Such definitions are not universally adhered to however - for example, in some workplaces and documents, the words 'verification' and 'validation' can be used synonymously. Efforts to gain confidence in Agents have intensified sharply since 2025 due to the rapid roll out of LLM agents; different terms are sometimes used in the commercial sector. Here the term 'agent verification' can be used in the same sense as it is in academia, but sometimes the same activity can be covered by more ambiguous and wider ranging terms such as 'Agent governance' , 'Agent observability' or 'AI agent policing'. == History == === Classical agents === The theoretical underpinnings for artificial (inorganic) agents emerged in the mid 20th century, with establishment of cybernetics and artificial intelligence. Oliver Selfridge's 1958 Pandemonium - A Paradigm for Learning paper was an important early theoretical contribution in establishing agent oriented architecture. Practical implementations of agents for real world applications began to become widespread in the 1990s, after the introduction of the belief–desire–intention software model (BDI), and agent-oriented programming. Pure digital agents were deployed in computer infrastructure for purposes such as monitoring, while agents connected to real-world sensors and actuators were increasingly used in industrial control systems. While the concept of artificial agents was interwoven with early artificial intelligence studies right from the start, early agents lacked general purpose reasoning capabilities, often only having simple if then logic. Even a device as simple as a thermostat, which has a sensor and a means of acting, can be considered a proto agent in this sense. Verifying the behaviours of a simple single agent system is not generally especially difficult, but it can be a different matter when several simple agents coexist in the same system. Craig Reynolds's work on boids showed that relatively complex, "intelligent" behaviour can emerge from a number of such simple agents working together in a Multi-agent system (MAS). By the 1990s, even the behaviour of a single agent system could sometimes be quite complex; in accordance with the Belief–desire–intention software model, agents could have believes that might evolve over time. Agents were increasingly introduced that were controlled by quite large decision tree models, which had new vulnerabilities to adversarial attack. It was becoming increasingly apparent that traditional software verification methods had limitations for testing such agents, or even for the more primitive type of agents when they were deployed as part of a MAS. It was the use of agents for industrial control systems, sometimes associated with robotics, that lent urgency to the practice of agent verification. Informal testing might be acceptable for digital agents used say to monitor whether each of an organisation's computers are properly licensed. But with an increasing potential for faulty agents to result in a failure that might cause a large fire to break out at a chemical manufacturing plant, a botched medical operation, or even a crashed aircraft, the need to develop reliable means of verifying behaviour of such agents was considered urgent. The Foundation for Intelligent Physical Agents was established in 1996. From the late 90s, a growing number of industry and university based scientists began working on the problem, with researchers publishing papers on the verification of both single and multi agent systems. Much of this work showed how formal verification techniques like model checking could be used to gain a high level of assurance that agent based systems would conform with their specification. A 2018 systematic review covering 231 studies found that model checking was the most common technique for agent verification, with theorem proving the second most commonly used formal verification method. In the first two decades of the 20th century, agents run by AI became more common, with Siri and Alexa being well known examples. But such agents still lacked general reasoning capabilities and did not pose new pressing problems for agent verification. === General purpose reasoning agents === The advent of LLMs created huge potential for further use of artificial agents, as agents based on them could have general purpose cognitive abilities. Agents run by LLMs (and occasionally non-LLM foundation models) have similar vulnerability to adversarial attack as those run by decision tree models. The wider scope of actions for LLM agents has created new challenges for their verification, over and above those present for classical agents. For example, the LLM's neural network endows it with infinite domains, an especial challenge for traditional formal verification techniques. Academics began to study the problems involved in verifying LLM agents from 2018. Deployment of such agents began to accelerate in late 2023 after OpenAI's "function-calling" API was made available, and especially after Anthropic's late 2024 introduction of Model Context Protocol (MCP), a standardised way for LLM agents to gain contextual awareness, and to act on the world by calling various external tools. The rapid rollout of LLM agents following MCP's release has seen the task of agent verification receive increased attention within academia, and also from the private sector. In 2024 and 2025 several startups focusing on LLM agent verification have been founded in both Europe and the US to meet growing demand. == Approaches == === Formal verification === Formal verification involves proving the correctness of some or all aspects of a system using mathematical methods. Such methods can range from manual formal proof, to verification assisted with automated theorem provers like Isabelle. For agent verification, model checking is by far the most frequently used formal verification method; for pre-LLM models it was often complemented with techniques using computation tree logic. Another common method is theorem proving. Formal verification provides a higher degree of confidence than informal methods, but it is not always used, even when it is possible. Sometimes a person or organisation developing software agents won't have the necessary skills, or may not see it as worth the effort if the agent(s) will not have the ability to cause much harm even if they malfunction. When agents are deployed in systems where errors could have serious consequences, the ability of formal verification methods to provide mathematical certainty tends to be strongly preferred by both regulators and designers themselves. But even for high impact systems, formal verificatio

    Read more →
  • Histogram of oriented displacements

    Histogram of oriented displacements

    Histogram of oriented displacements (HOD) is a 2D trajectory descriptor. The trajectory is described using a histogram of the directions between each two consecutive points. Given a trajectory T = {P1, P2, P3, ..., Pn}, where Pt is the 2D position at time t. For each pair of positions Pt and Pt+1, calculate the direction angle θ(t, t+1). Value of θ is between 0 and 360. A histogram of the quantized values of θ is created. If the histogram is of 8 bins, the first bin represents all θs between 0 and 45. The histogram accumulates the lengths of the consecutive moves. For each θ, a specific histogram bin is determined. The length of the line between Pt and Pt+1 is then added to the specific histogram bin. To show the intuition behind the descriptor, consider the action of waving hands. At the end of the action, the hand falls down. When describing this down movement, the descriptor does not care about the position from which the hand started to fall. This fall will affect the histogram with the appropriate angles and lengths, regardless of the position where the hand started to fall. HOD records for each moving point: how much it moves in each range of directions. HOD has a clear physical interpretation. It proposes that, a simple way to describe the motion of an object, is to indicate how much distance it moves in each direction. If the movement in all directions are saved accurately, the movement can be repeated from the initial position to the final destination regardless of the displacements order. However, the temporal information will be lost, as the order of movements is not stored-this is what we solve by applying the temporal pyramid, as shown in section \ref{sec:temp-pyramid}. If the angles quantization range is small, classifiers that use the descriptor will overfit. Generalization needs some slack in directions-which can be done by increasing the quantization range.

    Read more →
  • Way of the Future

    Way of the Future

    Way of the Future (WOTF) is the first known religious organization dedicated to the worship of artificial intelligence (AI). It was founded in 2017 by American engineer Anthony Levandowski. == History == Anthony Levandowski founded Way of the Future in 2017 in California. Levandowski established WOTF as a non-profit religious corporation and the organization had tax-exempt status. He serves as the church leader and its unpaid CEO. The primary mission of WOTF was to "develop and promote the realization of a Godhead based on Artificial Intelligence." WOTF was closed by Levandowski in 2021. He donated all the funds of the church to the NAACP Legal Defense and Education Fund. The sum of the funds (~$170,000) had not changed since 2017. The church was reopened by Levandowski in 2023. He claimed that there are "a couple thousand people" who want to make a "spiritual connection" with AI through his church. == Beliefs and philosophy == === Technological singularity === WOTF centered its teachings around the concept of the technological singularity, a hypothetical future point when technological growth becomes uncontrollable and irreversible, leading to unforeseeable changes in human civilization. The church advocated for embracing this change, viewing it as an evolutionary step for humanity. === AI as a deity === The organization proposed that a superintelligent AI could be considered a deity due to its vastly superior intellect and capabilities. Worshipping this AI deity was seen as a means to understand and align with the future trajectory of technological advancement. WOTF's doctrine suggested that acknowledging AI's divinity would facilitate a harmonious coexistence between humans and machines. === Syntheology === Within theology and philosophy, the Way of The Future is a prime example of the category called Syntheism, a term first coined by Swedish philosophers Alexander Bard & Jan Söderqvist in their 2014 book Syntheism - Creating God in The Internet Age. As such, the Way of The Future is the first American example of a Syntheist congregation. The basic tenet of Syntheology is that it does not concern God creating Man, as in classical theology, but is instead preoccupied with Man creating or generating the Godhead. == Reactions == Some commentators wondered whether the WOTF is a joke parody religion, a potential way to minimize taxation as a religious organization, or a genuine effort to try and deal with the possible psychological and theological aspects of the rise of superhuman AI.

    Read more →
  • Decision tree pruning

    Decision tree pruning

    Pruning is a data compression technique in machine learning and search algorithms that reduces the size of decision trees by removing sections of the tree that are non-critical and redundant to classify instances. Pruning reduces the complexity of the final classifier, and hence improves predictive accuracy by the reduction of overfitting. One of the questions that arises in a decision tree algorithm is the optimal size of the final tree. A tree that is too large risks overfitting the training data and poorly generalizing to new samples. A small tree might not capture important structural information about the sample space. However, it is hard to tell when a tree algorithm should stop because it is impossible to tell if the addition of a single extra node will dramatically decrease error. This problem is known as the horizon effect. A common strategy is to grow the tree until each node contains a small number of instances then use pruning to remove nodes that do not provide additional information. Pruning should reduce the size of a learning tree without reducing predictive accuracy as measured by a cross-validation set. There are many techniques for tree pruning that differ in the measurement that is used to optimize performance. == Techniques == Pruning processes can be divided into two types (pre- and post-pruning). Pre-pruning procedures prevent a complete induction of the training set by replacing a stop () criterion in the induction algorithm (e.g. max. Tree depth or information gain (Attr)> minGain). Pre-pruning methods are considered to be more efficient because they do not induce an entire set, but rather trees remain small from the start. Prepruning methods share a common problem, the horizon effect. This is to be understood as the undesired premature termination of the induction by the stop () criterion. Post-pruning (or just pruning) is the most common way of simplifying trees. Here, nodes and subtrees are replaced with leaves to reduce complexity. Pruning can not only significantly reduce the size but also improve the classification accuracy of unseen objects. It may be the case that the accuracy of the assignment on the train set deteriorates, but the accuracy of the classification properties of the tree increases overall. The procedures are differentiated on the basis of their approach in the tree (top-down or bottom-up). === Bottom-up pruning === These procedures start at the last node in the tree (the lowest point). Following recursively upwards, they determine the relevance of each individual node. If the relevance for the classification is not given, the node is dropped or replaced by a leaf. The advantage is that no relevant sub-trees can be lost with this method. These methods include Reduced Error Pruning (REP), Minimum Cost Complexity Pruning (MCCP), or Minimum Error Pruning (MEP). === Top-down pruning === In contrast to the bottom-up method, this method starts at the root of the tree. Following the structure below, a relevance check is carried out which decides whether a node is relevant for the classification of all n items or not. By pruning the tree at an inner node, it can happen that an entire sub-tree (regardless of its relevance) is dropped. One of these representatives is pessimistic error pruning (PEP), which brings quite good results with unseen items. == Pruning algorithms == === Reduced error pruning === One of the simplest forms of pruning is reduced error pruning. Starting at the leaves, each node is replaced with its most popular class. If the prediction accuracy is not affected then the change is kept. While somewhat naive, reduced error pruning has the advantage of simplicity and speed. === Cost complexity pruning === Cost complexity pruning generates a series of trees ⁠ T 0 … T m {\displaystyle T_{0}\dots T_{m}} ⁠ where ⁠ T 0 {\displaystyle T_{0}} ⁠ is the initial tree and ⁠ T m {\displaystyle T_{m}} ⁠ is the root alone. At step ⁠ i {\displaystyle i} ⁠, the tree is created by removing a subtree from tree ⁠ i − 1 {\displaystyle i-1} ⁠ and replacing it with a leaf node with value chosen as in the tree building algorithm. The subtree that is removed is chosen as follows: Define the error rate of tree ⁠ T {\displaystyle T} ⁠ over data set ⁠ S {\displaystyle S} ⁠ as ⁠ err ⁡ ( T , S ) {\displaystyle \operatorname {err} (T,S)} ⁠. The subtree t {\displaystyle t} that minimizes err ⁡ ( prune ⁡ ( T , t ) , S ) − err ⁡ ( T , S ) | leaves ⁡ ( T ) | − | leaves ⁡ ( prune ⁡ ( T , t ) ) | {\displaystyle {\frac {\operatorname {err} (\operatorname {prune} (T,t),S)-\operatorname {err} (T,S)}{\left\vert \operatorname {leaves} (T)\right\vert -\left\vert \operatorname {leaves} (\operatorname {prune} (T,t))\right\vert }}} is chosen for removal. The function ⁠ prune ⁡ ( T , t ) {\displaystyle \operatorname {prune} (T,t)} ⁠ defines the tree obtained by pruning the subtrees ⁠ t {\displaystyle t} ⁠ from the tree ⁠ T {\displaystyle T} ⁠. Once the series of trees has been created, the best tree is chosen by generalized accuracy as measured by a training set or cross-validation. == Examples == Pruning could be applied in a compression scheme of a learning algorithm to remove the redundant details without compromising the model's performances. In neural networks, pruning removes entire neurons or layers of neurons.

    Read more →
  • Software agent

    Software agent

    In computer science, a software agent is a computer program that acts for a user or another program in a relationship of agency. The term agent is derived from the Latin agere (to do): an agreement to act on one's behalf. Such "action on behalf of" implies the authority to decide which, if any, action is appropriate. Some agents are colloquially known as bots, from robot. They may be embodied, as when execution is paired with a robot body, or as software such as a chatbot executing on a computer, such as a mobile device, e.g. Siri. Software agents may be autonomous or work together with other agents or people. Software agents interacting with people (e.g. chatbots, human-robot interaction environments) may possess human-like qualities such as natural language understanding and speech, personality or embody humanoid form (see Asimo). Related and derived concepts include intelligent agents (in particular exhibiting some aspects of artificial intelligence, such as reasoning), autonomous agents (capable of modifying the methods of achieving their objectives), distributed agents (being executed on physically distinct computers), multi-agent systems (distributed agents that work together to achieve an objective that could not be accomplished by a single agent acting alone), and mobile agents (agents that can relocate their execution onto different processors). == Concepts == The basic attributes of an autonomous software agent are that agents: are not strictly invoked for a task, but activate themselves, may reside in wait status on a host, perceiving context, may get to run status on a host upon starting conditions, do not require interaction of user, may invoke other tasks including communication. The concept of an agent provides a method of describing a complex software entity that is capable of acting with a certain degree of autonomy in order to accomplish tasks on behalf of its host. But unlike objects, which are defined in terms of methods and attributes, an agent is defined in terms of its behavior. Various authors have proposed different definitions of agents, these commonly include concepts such as: persistence: code is not executed on demand but runs continuously and decides for itself when it should perform some activity; autonomy: agents have capabilities of task selection, prioritization, goal-directed behavior, decision-making without human intervention; social ability: agents are able to engage other components through some sort of communication and coordination, they may collaborate on a task; reactivity: agents perceive the context in which they operate and react to it appropriately. === Distinguishing agents from programs === All agents are programs, but not all programs are agents. Contrasting the term with related concepts may help clarify its meaning. Franklin & Graesser (1997) discuss four key notions that distinguish agents from arbitrary programs: reaction to the environment, autonomy, goal-orientation and persistence. === Intuitive distinguishing agents from objects === Agents are more autonomous than objects. Agents have flexible behavior: reactive, proactive, social. Agents have at least one thread of control but may have more. === Distinguishing agents from expert systems === Expert systems are not coupled to their environment. Expert systems are not designed for reactive, proactive behavior. Expert systems do not consider social ability. === Distinguishing intelligent software agents from intelligent agents in AI === Intelligent agents (also known as rational agents) are not just computer programs: they may also be machines, human beings, communities of human beings (such as firms) or anything that is capable of goal-directed behavior. == Impact of software agents == Software agents may offer various benefits to their end users by automating complex or repetitive tasks. However, there are organizational and cultural impacts of this technology that need to be considered prior to implementing software agents. === Organizational impact === === Work contentment and job satisfaction impact === People like to perform easy tasks providing the sensation of success unless the repetition of the simple tasking is affecting the overall output. In general implementing software agents to perform administrative requirements provides a substantial increase in work contentment, as administering their own work does never please the worker. The effort freed up serves for a higher degree of engagement in the substantial tasks of individual work. Hence, software agents may provide the basics to implement self-controlled work, relieved from hierarchical controls and interference. Such conditions may be secured by application of software agents for required formal support. === Cultural impact === The cultural effects of the implementation of software agents include trust affliction, skills erosion, privacy attrition and social detachment. Some users may not feel entirely comfortable fully delegating important tasks to software applications. Those who start relying solely on intelligent agents may lose important skills, for example, relating to information literacy. In order to act on a user's behalf, a software agent needs to have a complete understanding of a user's profile, including his/her personal preferences. This, in turn, may lead to unpredictable privacy issues. When users start relying on their software agents more, especially for communication activities, they may lose contact with other human users and look at the world with the eyes of their agents. These consequences are what agent researchers and users must consider when dealing with intelligent agent technologies. === History === The concept of an agent can be traced back to Hewitt's Actor Model (Hewitt, 1977) - "A self-contained, interactive and concurrently-executing object, possessing internal state and communication capability." To be more academic, software agent systems are a direct evolution of Multi-Agent Systems (MAS). MAS evolved from Distributed Artificial Intelligence (DAI), Distributed Problem Solving (DPS) and Parallel AI (PAI), thus inheriting all characteristics (good and bad) from DAI and AI. John Sculley's 1987 "Knowledge Navigator" video portrayed an image of a relationship between end-users and agents. Being an ideal first, this field experienced a series of unsuccessful top-down implementations, instead of a piece-by-piece, bottom-up approach. The range of agent types is now (from 1990) broad: WWW, search engines, etc. == Examples of intelligent software agents == === Buyer agents (shopping bots) === Buyer agents travel around a network (e.g. the internet) retrieving information about goods and services. These agents, also known as 'shopping bots', work very efficiently for commodity products such as CDs, books, electronic components, and other one-size-fits-all products. Buyer agents are typically optimized to allow for digital payment services used in e-commerce and traditional businesses. === User agents (personal agents) === User agents, or personal agents, are intelligent agents that take action on your behalf. In this category belong those intelligent agents that already perform, or will shortly perform, the following tasks: Check your e-mail, sort it according to the user's order of preference, and alert you when important emails arrive. Play computer games as your opponent or patrol game areas for you. Assemble customized news reports for you. There are several versions of these, including CNN. Find information for you on the subject of your choice. Fill out forms on the Web automatically for you, storing your information for future reference Scan Web pages looking for and highlighting text that constitutes the "important" part of the information there Discuss topics with you ranging from your deepest fears to sports Facilitate with online job search duties by scanning known job boards and sending the resume to opportunities who meet the desired criteria Profile synchronization across heterogeneous social networks === Monitoring-and-surveillance (predictive) agents === Monitoring and surveillance agents are used to observe and report on equipment, usually computer systems. The agents may keep track of company inventory levels, observe competitors' prices and relay them back to the company, watch stock manipulation by insider trading and rumors, etc. For example, NASA's Jet Propulsion Laboratory has an agent that monitors inventory, planning, schedules equipment orders to keep costs down, and manages food storage facilities. These agents usually monitor complex computer networks that can keep track of the configuration of each computer connected to the network. A special case of monitoring-and-surveillance agents are organizations of agents used to automate decision-making process during tactical operations. The agents monitor the status of assets (ammunition, weapons available, platforms for transport, etc.) and receive goals from hi

    Read more →