AI For Students Gemini

AI For Students Gemini — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Contract management software


    Contract management software is software, together with associated data management, used to support contract management, contract lifecycle management, and contractor management on projects in the procurement of goods and services. It may be used together with project management software.

== History ==

Historically, contract management was seen as a "paper-intensive" process. Early steps in the early 2000s, reported by the Aberdeen Group, required extensive data conversion work to enable documents to be handled electronically.

With the adoption of the European Union's General Data Protection Regulation (GDPR) in 2016, companies needed to take additional steps with regard to contract management. Each data controller was obliged to sign data processing agreements (DPAs) with the various vendors that process personal data on the controller's behalf. DPAs need to be regularly reviewed, adjusted and renewed, which adds an extra agreement for each such vendor, or at least an extra DPA addendum to each agreement.

By 2018, Ardent Partners' research had found that software for automating contract management activities was being used more extensively among major companies and businesses with "Best-in-Class" procurement teams. Contract management process automation was found to be closely linked with more effective internal business collaboration, standardization and risk management.

== Advantages and key functions ==

Using contract management software can have multiple benefits compared with manually managing paper contracts. The software can help keep track of multiple activities and can have features for automating administration, ensuring compliance, monitoring risk, running reports and triggering alerts. In addition, contract management software systems provide a centralized repository where employees can quickly access all contracts worldwide in one place.

Contract management software is produced by many companies, working on a range of scales and offering varying degrees of customizability. Basic functions should include the ability to store contract documents, track changes to contract documents, search documents by a particular criterion, send key date alerts, and report on required aspects of the contract. Other functions include managing a new contract request, capturing related data, following a document through a review and approval process, and collecting digital signatures. Contract management software may also be an aid to project portfolio management and spend analysis, and may also monitor KPIs. Leading contract management software provides contract visibility, monitoring, and compliance to automate and streamline the contract lifecycle process.

Contract management software which uses artificial intelligence (AI) can identify contract types based on pattern recognition. AI contracting software trains its algorithms on a set of contract data to recognize patterns and extract variables such as clauses, dates, and parties. It also offers simple prediction capabilities, by sorting through a large volume of contracts and flagging individual contracts based on specified criteria. AI software can also read contracts in multiple formats and languages, extract contract data, and provide analytics. It can reduce the risk of human error in contract drafting and review.

A centralized repository provides a critical advantage by allowing all contract documents to be stored in one location. Having contracts stored in multiple locations can delay and interrupt the contracting process.

== Contract risk management software (CRMS) for capital projects ==

Very large enterprises, such as capital expenditure (capex) projects, involve multiple parties and high risk and uncertainty. They are unlike traditional operating contracts in that they are subject to shared deadlines in unique situations.

As the complexity of these unique projects increases, the relationships between parties become more important. This requires contract management software, or contract risk management software (CRMS), to become more dynamic and responsive. The terms of these capex contracts necessarily involve assumptions at the start of the process and are likely to change over the lifetime of the project. For this reason, CRMS must be capable of recording one single instance of agreed changes to contract terms and incorporating these changes in an auditable and legally robust way. With multiple decision makers involved, CRMS should also make accountability more transparent and enable faster decisions about variation proposals.
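The requirement to record "one single instance of agreed changes" in an auditable way can be illustrated with an append-only log in which every entry carries a hash of the previous one. This is only a sketch under stated assumptions: the class name `VariationLog`, its fields and the clause numbers are hypothetical and not taken from any CRMS product, and a hash chain makes tampering detectable but does not by itself make a record legally robust.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of one log entry."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class VariationLog:
    """Hypothetical append-only record of agreed changes to contract terms."""
    entries: list = field(default_factory=list)

    def record(self, clause: str, new_text: str, approved_by: str) -> dict:
        # Each entry stores the hash of its predecessor, forming a chain.
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {
            "seq": len(self.entries) + 1,
            "clause": clause,
            "new_text": new_text,
            "approved_by": approved_by,
            "prev_hash": prev_hash,
        }
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True

    def current_terms(self) -> dict:
        """Latest agreed text per clause, obtained by replaying the log."""
        terms = {}
        for entry in self.entries:
            terms[entry["clause"]] = entry["new_text"]
        return terms
```

Replaying the log gives every party the same current terms, while `verify()` lets any of them check that earlier entries have not been altered after the fact.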

  • Cloud manufacturing


    Cloud manufacturing (CMfg) is a new manufacturing paradigm developed from existing advanced manufacturing models (e.g., ASP, AM, NM, MGrid) and enterprise information technologies under the support of cloud computing, Internet of Things (IoT), virtualization and service-oriented technologies, and advanced computing technologies. It transforms manufacturing resources and manufacturing capabilities into manufacturing services, which can be managed and operated in an intelligent and unified way to enable the full sharing and circulation of manufacturing resources and manufacturing capabilities. CMfg can provide safe, reliable, high-quality, cheap and on-demand manufacturing services for the whole lifecycle of manufacturing. The concept of manufacturing here refers to big manufacturing that includes the whole lifecycle of a product (e.g. design, simulation, production, test, maintenance).

The concept of cloud manufacturing was initially proposed by the research group led by Prof. Bo Hu Li and Prof. Lin Zhang in China in 2010. Related discussions and research followed, and some similar definitions (e.g. Cloud-Based Design and Manufacturing (CBDM)) were introduced.

Cloud manufacturing is a type of parallel, networked, and distributed system consisting of an integrated and inter-connected virtualized service pool (manufacturing cloud) of manufacturing resources and capabilities, as well as capabilities for intelligent management and on-demand use of services, to provide solutions for all kinds of users involved in the whole lifecycle of manufacturing.

== Types ==

Cloud manufacturing can be divided into two categories. The first category concerns deploying manufacturing software on the cloud, i.e. a "manufacturing version" of computing. CAx software can be supplied as a service on the Manufacturing Cloud (MCloud).

The second category has a broader scope, cutting across production, management, design and engineering abilities in a manufacturing business. Unlike computing and data storage, manufacturing involves physical equipment, monitors, materials and so on. In this kind of cloud manufacturing system, both material and non-material facilities are implemented on the Manufacturing Cloud to support the whole supply chain. Costly resources are shared on the network. This means that the utilisation rate of rarely used equipment rises and the cost of expensive equipment is reduced. According to the concept of cloud technology, there will not be direct interaction between Cloud Users and Service Providers. The Cloud User should neither manage nor control the infrastructure and manufacturing applications. As a matter of fact, the former can be considered part of the latter.

In a CMfg system, various manufacturing resources and abilities can be intelligently sensed and connected to the wider Internet, and automatically managed and controlled using IoT technologies (e.g., RFID, wired and wireless sensor networks, embedded systems). The manufacturing resources and abilities are then virtualized and encapsulated into different manufacturing cloud services (MCSs), which can be accessed, invoked, and deployed based on knowledge by using virtualization technologies, service-oriented technologies, and cloud computing technologies. The MCSs are classified and aggregated according to specific rules and algorithms, and different kinds of manufacturing clouds are constructed. Different users can search for and invoke the qualified MCSs from the related manufacturing cloud according to their needs, and assemble them into a virtual manufacturing environment or solution to complete their manufacturing task anywhere in the whole life cycle of manufacturing processes, under the support of cloud computing, service-oriented technologies, and advanced computing technologies.

Four cloud deployment modes (public, private, community and hybrid clouds) are in common use, each acting as a single point of access. Private cloud refers to a centralized management effort in which manufacturing services are shared within one company or its subsidiaries. Enterprises' mission-critical and core-business applications are often kept in a private cloud. Community cloud is a collaborative effort in which manufacturing services are shared between several organizations from a specific community with common concerns. Public cloud realizes the key concept of sharing services with the general public in a multi-tenant environment. Hybrid cloud is a composition of two or more clouds (private, community or public) that remain distinct entities but are also bound together, offering the benefits of multiple deployment modes.

== Resources ==

From the resource perspective, each kind of manufacturing capability requires support from the related manufacturing resource. For each type of manufacturing capability, its related manufacturing resource comes in two forms: soft resources and hard resources.

=== Soft resources ===

Software: software applications used throughout the product lifecycle, including design, analysis, simulation and process planning. Such applications are only beginning to be embraced by the electronics manufacturing industry.
Knowledge: experience and know-how needed to complete a production task, e.g. engineering knowledge, product models, standards, evaluation procedures and results, and customer feedback. Manufacturing in the cloud raises as many questions as it provides solutions for manufacturing executives wanting to make the best possible decision.
Skill: expertise in performing a specific manufacturing task.
Personnel: human resources engaged in the manufacturing process, e.g. designers, operators, managers, technicians, project teams, customer service, etc.
Experience: performance, quality, client evaluation, etc.
Business Network: business relationships and business opportunity networks that exist in an enterprise.

=== Hard resources ===

Manufacturing Equipment: facilities needed for completing a manufacturing task, e.g. machine tools, cutters, test and monitoring equipment and other fabrication tools.
Monitoring/Control Resource: devices used to identify and control other manufacturing resources, for instance RFID (Radio-Frequency IDentification), WSN (Wireless Sensor Network), virtual managers and remote controllers.
Computational Resource: computing devices that support the production process, e.g. servers, computers, storage media, control devices, etc.
Materials: inputs and outputs in a production system, e.g. raw material, product-in-progress, finished product, power, water, lubricants, etc.
Storage: automated storage and retrieval systems, logic controllers, location of warehouses, volume capacity and schedule/optimization methods.
Transportation: movement of manufacturing inputs/outputs from one location to another. It includes the modes of transport, e.g. air, rail, road, water, cable, pipeline and space, and the related price and time taken.

  • Cloud printing


    There are, in essence, three kinds of cloud printing.

== Benefits ==

76% of IT teams have moved, or plan to move, their print workflows to the cloud because of its simplicity. Consumers can print easily to any printer from their PC, tablet or smartphone, while the cloud print service monitors supply levels. Many printer vendors, such as Lexmark, offer automatic supplies shipment based on real-time analysis of printer supplies and user behavior, to ensure printing is always possible.

For IT departments, cloud printing eliminates the need for print servers and represents the only way to print from cloud virtual desktops and servers. For consumers, cloud-ready printers eliminate the need for PC connections and print drivers, enabling them to print from mobile devices. As for publishers and content owners, cloud printing allows them to "avoid the cost and complexity of buying and managing the underlying hardware, software and processes" required for the production of professional print products. Leveraging cloud print for print on demand also allows businesses to cut down on the costs associated with mass production. Moreover, cloud printing can be considered more eco-friendly, as it significantly reduces the amount of paper used (13% reduction in print jobs yearly) and lowers carbon emissions from transportation.

As many companies move their IT to the cloud, some adopting the Windows 365 and Azure Virtual Desktop services from Microsoft, the connection from the cloud environment to on-premises printers becomes an issue, as opening ports for incoming print traffic is not an option. In 2020, at the same time as Google discontinued its Google Cloud Print offer, Microsoft announced its Universal Print service, aimed at making printing compatible with cloud desktop environments and making printing driver-free and simple, with no client to install on the PC.

With Universal Print, Microsoft has built a disruptive architecture whose value proposition is to commoditize printers: it removes print servers and drivers, allows printers to be moved to a VLAN for security purposes, and allows printing from anywhere. Clients are free to use any printer model, as they all work the same way, and are no longer tied to any printer brand, which gave a significant boost to the cloud print market. The Universal Print architecture provides APIs to third-party developers, who can develop add-ons such as Celiveo 365 to extend Microsoft's cloud printing with features such as access control on printers and copiers, follow-me pull print, data encryption, advanced usage reporting or chargeback.

== Providers of Consumer Cloud Printing Solutions ==

Before 2020, only a handful of providers worked towards a professional cloud print solution, operating in their own niche or focusing on mobile devices. In 2020 Microsoft boosted that market by announcing its Universal Print cloud printing service, and since then many publishers have started to offer solutions for that growing market. The Covid pandemic also created the need for employees to be able to print at home when using corporate IT software. Closed VPNs often prevent access to home network printers from corporate laptops, and Full Public Cloud solutions are meant to solve that problem.

After Google's decision to terminate the Google Cloud Print service on 31 December 2020, most printer vendors released their own mobile cloud solution to fill the gap, while Hewlett-Packard implemented its own cloud print with its ePrint solution. Those solutions are often proprietary, working only on printers offered by the vendor. Google has decided to let third-party developers develop cloud print solutions and to limit its scope to certifying the best print management offers compatible with its Chrome Enterprise cloud ecosystem.

== Providers of Corporate Cloud Printing solutions ==

While many print solutions claim to be "Cloud Printing", there are actually three categories: full Private Cloud, full Public Cloud, and Hybrid Cloud. Their differences are real and have an impact on the overall TCO: the more software there is on-site, the more hidden costs there are.

In the Full Public Cloud category, independent SaaS vendors like Celiveo, ezeep, Printix, and Y Soft support a wide range of printer brands and models, allowing clients to buy the best printer without being locked into any brand. They leverage cloud computing technology to offer cloud-based print infrastructure and cloud-based printing software as a service (SaaS). These solutions have integrations with cloud-enabled printers or provide embedded printer agents. They allow users to print to any printer in any network, isolated or not, even if that printer is otherwise not reachable from the user's computer. This also allows IT departments to move printers to a VLAN for maximum security, as they do with IP phones.

The Google Chrome Enterprise cloud ecosystem has its own technical particularities. Google certifies print management solutions, ensuring they comply with Google's technical requirements while letting each solution differentiate itself from the others with specific features or security. Many of the solutions for Chrome Enterprise are Hybrid; a few are Full Public Cloud.

Industry experts believe that as these services become more popular, users will no longer consider printers as necessary assets but rather as devices that they can access on demand when the need to generate a printed page presents itself.

== Caveats of Cloud Printing ==

=== Security ===

Print jobs flow through the public Internet. It is therefore important to verify that no man-in-the-middle attack can be performed. The only technical solution is to ensure each printer and PC uses a non-self-generated cryptographic token or certificate allowing TLS mutual authentication and specific data encryption. Self-generated printer certificates are unknown to the cloud and prevent trusted authentication. Microsoft has implemented its Zero Trust Access security in its Universal Print service, which generates a unique certificate on printers compatible with the service. Other cloud printing SaaS providers have followed Microsoft on that high-security path.

Print job data stored in the cloud is sensitive, as it contains user information as well as all information appearing on pages. Good practice requires such data to be encrypted at rest and in motion, using asymmetric PKI keys instead of fixed encryption keys.

Some solutions require incoming traffic ports to be opened on the firewall to let cloud services communicate with printers behind that firewall (most of the time for IPP/IPPS flows). Other solutions use a pull model, where the communication is always initiated by the printer and no firewall port needs to be open. In terms of security, the latter is to be preferred.
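The pull model described above can be sketched as a small in-memory simulation. Everything here is hypothetical: `CloudPrintQueue` and `PullPrinter` are illustrative names, there is no real network, TLS or vendor API involved, and the only point is the direction of the call, which always goes from the printer to the cloud.

```python
from collections import deque


class CloudPrintQueue:
    """Stand-in for a cloud print service that holds jobs per printer."""

    def __init__(self):
        self._jobs = {}

    def submit(self, printer_id, document):
        # Called by a user's device; the job waits in the cloud.
        self._jobs.setdefault(printer_id, deque()).append(document)

    def fetch_next(self, printer_id):
        # Answered only when the printer asks; the cloud never calls the printer.
        queue = self._jobs.get(printer_id)
        return queue.popleft() if queue else None


class PullPrinter:
    """Printer behind a firewall that only makes outbound requests."""

    def __init__(self, printer_id, cloud):
        self.printer_id = printer_id
        self.cloud = cloud
        self.printed = []

    def poll_once(self):
        """One outbound poll; returns True if a job was fetched and printed."""
        job = self.cloud.fetch_next(self.printer_id)
        if job is None:
            return False
        self.printed.append(job)
        return True
```

Because `PullPrinter` exposes nothing for the cloud to call, the corresponding firewall would need no inbound rule. A real deployment would additionally wrap each poll in mutually authenticated TLS, as the paragraphs above describe.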

  • Data cube


    In computer programming, a data cube (or datacube) is a multi-dimensional array of values. Typically, the term "data cube" is applied in contexts where these arrays are massively larger than the hosting computer's main memory; examples include multi-terabyte/petabyte data warehouses and time series of image data. Even though it is called a cube, a data cube is generally a multi-dimensional concept which can be 1-dimensional, 2-dimensional, 3-dimensional, or higher-dimensional.

The data cube is used to represent data (sometimes called facts) along some dimensions of interest. In satellite image timeseries, the dimensions would be latitude and longitude coordinates and time; a fact (sometimes called a measure) would be a pixel at a given place and time as taken by the satellite. In online analytical processing, an OLAP cube about a company could have as dimensions the company subsidiaries, the company products, and time; in this setup, a fact would be a sales event where a particular product has been sold in a particular subsidiary at a particular time. In any case, every dimension divides data into groups of cells, and each cell in the cube represents a single measure of interest.

Sometimes cubes hold only a few values with the rest being empty, i.e. undefined, while sometimes most or all cube coordinates hold a cell value. In the first case such data are called sparse, and in the second case they are called dense, although there is no hard delineation between the two.

Data cubes may be stored in database management systems (DBMS) as part of array DBMS. Spatio-temporal databases and geospatial databases may also be represented as coverage data.

== History ==

Multi-dimensional arrays have long been familiar in programming languages. Fortran offers arbitrarily-indexed 1-D arrays and arrays of arrays, which allows the construction of higher-dimensional arrays, up to 15 dimensions. APL supports n-D arrays with a rich set of operations. All these have in common that arrays must fit into main memory and are available only while the particular program maintaining them (such as image processing software) is running.

A series of data exchange formats support storage and transmission of data cube-like data, often tailored towards particular application domains. Examples include MDX for statistical (in particular, business) data, Zarr and Hierarchical Data Format for general scientific data, and TIFF for imagery.

In 1992, Peter Baumann introduced management of massive data cubes with high-level user functionality combined with an efficient software architecture. Datacube operations include subset extraction, processing, fusion, and in general queries in the spirit of data manipulation languages like SQL.

Some years later, the data cube concept was applied to describe time-varying business data by Jim Gray, et al., and by Venky Harinarayan, Anand Rajaraman and Jeff Ullman. Around that time, a working group on Multi-Dimensional Databases ("Arbeitskreis Multi-Dimensionale Datenbanken") was established at the German Gesellschaft für Informatik.

Datacube Inc. was an image processing company selling hardware and software applications for the PC market in 1996, though without addressing data cubes as such.

The EarthServer initiative has established geo data cube service requirements.

== Standardization ==

In 2018, the ISO SQL database language was extended with data cube functionality as "SQL – Part 15: Multi-dimensional arrays (SQL/MDA)". Web Coverage Processing Service is a geo data cube analytics language issued by the Open Geospatial Consortium in 2008. In addition to the common data cube operations, the language knows about the semantics of space and time and supports both regular and irregular grid data cubes, based on the concept of coverage data.

An industry standard for querying business data cubes, originally developed by Microsoft, is MultiDimensional eXpressions.

== Implementation ==

Many high-level computer languages treat data cubes and other large arrays as single entities distinct from their contents. These languages, of which Fortran, APL, IDL, NumPy, PDL, and S-Lang are examples, allow the programmer to manipulate complete film clips and other data en masse with simple expressions derived from linear algebra and vector mathematics. Some languages (such as PDL) distinguish between a list of images and a data cube, while many (such as IDL) do not.

Array DBMSs (Database Management Systems) offer a data model which generically supports definition, management, retrieval, and manipulation of n-dimensional data cubes. This database category has been pioneered by the rasdaman system since 1994.

== Applications ==

Multi-dimensional arrays can meaningfully represent spatio-temporal sensor, image, and simulation data, but also statistics data where the semantics of dimensions is not necessarily of a spatial or temporal nature. Generally, any kind of axis can be combined with any other into a data cube.

=== Mathematics ===

In mathematics, a one-dimensional array corresponds to a vector and a two-dimensional array resembles a matrix; more generally, a tensor may be represented as an n-dimensional data cube.

=== Science and engineering ===

For a time sequence of color images, the array is generally four-dimensional, with the dimensions representing image X and Y coordinates, time, and RGB (or other color space) color plane. For example, the EarthServer initiative unites data centers from different continents offering 3-D x/y/t satellite image timeseries and 4-D x/y/z/t weather data for retrieval and server-side processing through the Open Geospatial Consortium WCPS geo data cube query language standard.

A data cube is also used in the field of imaging spectroscopy, since a spectrally-resolved image is represented as a three-dimensional volume.

Earth observation data cubes combine satellite imagery such as Landsat 8 and Sentinel-2 with geographic information system analytics.

=== Business intelligence ===

In online analytical processing (OLAP), data cubes are a common arrangement of business data suitable for analysis from different perspectives through operations like slicing, dicing, pivoting, and aggregation.
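The OLAP operations named above can be illustrated on a tiny sparse cube whose dimensions follow the earlier company example (subsidiary, product, time). This is a minimal sketch using only plain Python dictionaries: the sales figures and the function names `slice_cube`, `dice_cube` and `roll_up` are made up for illustration and do not correspond to any OLAP product or to MDX syntax.

```python
from collections import defaultdict

# Sparse cube: only non-empty cells are stored, keyed by one coordinate per dimension.
DIMS = ("subsidiary", "product", "month")
sales = {
    ("Berlin", "widget", "2024-01"): 120.0,
    ("Berlin", "gadget", "2024-01"): 80.0,
    ("Berlin", "widget", "2024-02"): 150.0,
    ("Lyon", "widget", "2024-01"): 60.0,
    ("Lyon", "gadget", "2024-02"): 40.0,
}


def slice_cube(cube, dim, value):
    """Slicing: fix one dimension to a single value and drop it from the keys."""
    axis = DIMS.index(dim)
    return {k[:axis] + k[axis + 1:]: v for k, v in cube.items() if k[axis] == value}


def dice_cube(cube, **allowed):
    """Dicing: keep cells whose coordinates lie in the allowed sets; all dimensions remain."""
    def keep(key):
        return all(key[DIMS.index(d)] in values for d, values in allowed.items())
    return {k: v for k, v in cube.items() if keep(k)}


def roll_up(cube, dim):
    """Aggregation: sum the measure over every dimension except `dim`."""
    axis = DIMS.index(dim)
    totals = defaultdict(float)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)
```

Here five of the eight possible cells hold a value, so the dictionary stores only those. A dense cube of the same shape would more naturally be held in a three-dimensional array.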

  • Reparameterization trick


    The reparameterization trick (aka "reparameterization gradient estimator") is a technique used in statistical machine learning, particularly in variational inference, variational autoencoders, and stochastic optimization. It allows for the efficient computation of gradients through random variables, enabling the optimization of parametric probability models using stochastic gradient descent, and the variance reduction of estimators.

It was developed in the 1980s in operations research, under the name of "pathwise gradients" or "stochastic gradients". Its use in variational inference was proposed in 2013.

== Mathematics ==

Let $z$ be a random variable with distribution $q_\phi(z)$, where $\phi$ is a vector containing the parameters of the distribution.

=== REINFORCE estimator ===

Consider an objective function of the form:

$$L(\phi) = \mathbb{E}_{z \sim q_\phi(z)}[f(z)]$$

Without the reparameterization trick, estimating the gradient $\nabla_\phi L(\phi)$ can be challenging, because the parameter appears in the random variable itself. In more detail, we have to statistically estimate:

$$\nabla_\phi L(\phi) = \nabla_\phi \int dz \; q_\phi(z) f(z)$$

The REINFORCE estimator, widely used in reinforcement learning and especially policy gradient, uses the following equality:

$$\nabla_\phi L(\phi) = \int dz \; q_\phi(z) \nabla_\phi(\ln q_\phi(z)) f(z) = \mathbb{E}_{z \sim q_\phi(z)}[\nabla_\phi(\ln q_\phi(z)) f(z)]$$

This allows the gradient to be estimated:

$$\nabla_\phi L(\phi) \approx \frac{1}{N} \sum_{i=1}^{N} \nabla_\phi(\ln q_\phi(z_i)) f(z_i)$$

The REINFORCE estimator has high variance, and many methods were developed to reduce its variance.

=== Reparameterization estimator ===

The reparameterization trick expresses $z$ as:

$$z = g_\phi(\epsilon), \quad \epsilon \sim p(\epsilon)$$

Here, $g_\phi$ is a deterministic function parameterized by $\phi$, and $\epsilon$ is a noise variable drawn from a fixed distribution $p(\epsilon)$. This gives:

$$L(\phi) = \mathbb{E}_{\epsilon \sim p(\epsilon)}[f(g_\phi(\epsilon))]$$

Now, the gradient can be estimated as:

$$\nabla_\phi L(\phi) = \mathbb{E}_{\epsilon \sim p(\epsilon)}[\nabla_\phi f(g_\phi(\epsilon))] \approx \frac{1}{N} \sum_{i=1}^{N} \nabla_\phi f(g_\phi(\epsilon_i))$$

== Examples ==

For some common distributions, the reparameterization trick takes specific forms:

Normal distribution: for $z \sim \mathcal{N}(\mu, \sigma^2)$, we can use:

$$z = \mu + \sigma \epsilon, \quad \epsilon \sim \mathcal{N}(0, 1)$$

Exponential distribution: for $z \sim \text{Exp}(\lambda)$, we can use:

$$z = -\frac{1}{\lambda} \log(\epsilon), \quad \epsilon \sim \text{Uniform}(0, 1)$$

Discrete distributions can be reparameterized by the Gumbel distribution (Gumbel-softmax trick or "concrete distribution") and diffusion models.

In general, any distribution that is differentiable with respect to its parameters can be reparameterized by inverting the multivariable CDF function and then applying the implicit method. This approach has been applied to the Gamma, Beta, Dirichlet, and von Mises distributions.
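The two estimators above can be compared numerically on a toy problem. The sketch below assumes $f(z) = z^2$ and $q_\phi = \mathcal{N}(\mu, \sigma^2)$ with $\phi = \mu$, so the exact gradient is $2\mu$. These choices, the function names and the sample sizes are illustrative assumptions and not part of the source. For this Gaussian the score is $\nabla_\mu \ln q_\mu(z) = (z - \mu)/\sigma^2$, and under $z = \mu + \sigma\epsilon$ the pathwise derivative is $\nabla_\mu f(z) = 2z$.

```python
import random
import statistics


def f(z):
    """Toy objective integrand; E[f(z)] = mu**2 + sigma**2 for a Gaussian."""
    return z * z


def reinforce_grad(mu, sigma, n, rng):
    """Score-function estimate: average of grad_mu(ln q(z)) * f(z)."""
    total = 0.0
    for _ in range(n):
        z = rng.gauss(mu, sigma)
        score = (z - mu) / sigma**2  # d/dmu of ln N(z; mu, sigma^2)
        total += score * f(z)
    return total / n


def reparam_grad(mu, sigma, n, rng):
    """Pathwise estimate: average of d f(mu + sigma*eps) / d mu."""
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps
        total += 2.0 * z  # f'(z) * dz/dmu, with dz/dmu = 1
    return total / n


def compare(mu=1.5, sigma=1.0, n=100, repeats=200, seed=0):
    """Return (mean, variance) of each estimator over repeated runs."""
    rng = random.Random(seed)
    r = [reinforce_grad(mu, sigma, n, rng) for _ in range(repeats)]
    p = [reparam_grad(mu, sigma, n, rng) for _ in range(repeats)]
    return (
        (statistics.mean(r), statistics.variance(r)),
        (statistics.mean(p), statistics.variance(p)),
    )
```

Both estimators are unbiased for $2\mu = 3$ here, but in this setting the spread of the score-function estimate across runs is noticeably larger, which matches the remark that REINFORCE has high variance.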
== Applications == === Variational autoencoder === In Variational Autoencoders (VAEs), the VAE objective function, known as the Evidence Lower Bound (ELBO), is given by: ELBO ( ϕ , θ ) = E z ∼ q ϕ ( z | x ) [ log ⁡ p θ ( x | z ) ] − D KL ( q ϕ ( z | x ) | | p ( z ) ) {\displaystyle {\text{ELBO}}(\phi ,\theta )=\mathbb {E} _{z\sim q_{\phi }(z|x)}[\log p_{\theta }(x|z)]-D_{\text{KL}}(q_{\phi }(z|x)||p(z))} where q ϕ ( z | x ) {\displaystyle q_{\phi }(z|x)} is the encoder (recognition model), p θ ( x | z ) {\displaystyle p_{\theta }(x|z)} is the decoder (generative model), and p ( z ) {\displaystyle p(z)} is the prior distribution over latent variables. The gradient of ELBO with respect to θ {\displaystyle \theta } is simply E z ∼ q ϕ ( z | x ) [ ∇ θ log ⁡ p θ ( x | z ) ] ≈ 1 L ∑ l = 1 L ∇ θ log ⁡ p θ ( x | z l ) {\displaystyle \mathbb {E} _{z\sim q_{\phi }(z|x)}[\nabla _{\theta }\log p_{\theta }(x|z)]\approx {\frac {1}{L}}\sum _{l=1}^{L}\nabla _{\theta }\log p_{\theta }(x|z_{l})} but the gradient with respect to ϕ {\displaystyle \phi } requires the trick. Express the sampling operation z ∼ q ϕ ( z | x ) {\displaystyle z\sim q_{\phi }(z|x)} as: z = μ ϕ ( x ) + σ ϕ ( x ) ⊙ ϵ , ϵ ∼ N ( 0 , I ) {\displaystyle z=\mu _{\phi }(x)+\sigma _{\phi }(x)\odot \epsilon ,\quad \epsilon \sim {\mathcal {N}}(0,I)} where μ ϕ ( x ) {\displaystyle \mu _{\phi }(x)} and σ ϕ ( x ) {\displaystyle \sigma _{\phi }(x)} are the outputs of the encoder network, and ⊙ {\displaystyle \odot } denotes element-wise multiplication. Then we have ∇ ϕ ELBO ( ϕ , θ ) = E ϵ ∼ N ( 0 , I ) [ ∇ ϕ log ⁡ p θ ( x | z ) + ∇ ϕ log ⁡ q ϕ ( z | x ) − ∇ ϕ log ⁡ p ( z ) ] {\displaystyle \nabla _{\phi }{\text{ELBO}}(\phi ,\theta )=\mathbb {E} _{\epsilon \sim {\mathcal {N}}(0,I)}[\nabla _{\phi }\log p_{\theta }(x|z)+\nabla _{\phi }\log q_{\phi }(z|x)-\nabla _{\phi }\log p(z)]} where z = μ ϕ ( x ) + σ ϕ ( x ) ⊙ ϵ {\displaystyle z=\mu _{\phi }(x)+\sigma _{\phi }(x)\odot \epsilon } . 
This allows us to estimate the gradient using Monte Carlo sampling: ∇ ϕ ELBO ( ϕ , θ ) ≈ 1 L ∑ l = 1 L [ ∇ ϕ log ⁡ p θ ( x | z l ) + ∇ ϕ log ⁡ q ϕ ( z l | x ) − ∇ ϕ log ⁡ p ( z l ) ] {\displaystyle \nabla _{\phi }{\text{ELBO}}(\phi ,\theta )\approx {\frac {1}{L}}\sum _{l=1}^{L}[\nabla _{\phi }\log p_{\theta }(x|z_{l})+\nabla _{\phi }\log q_{\phi }(z_{l}|x)-\nabla _{\phi }\log p(z_{l})]} where z l = μ ϕ ( x ) + σ ϕ ( x ) ⊙ ϵ l {\displaystyle z_{l}=\mu _{\phi }(x)+\sigma _{\phi }(x)\odot \epsilon _{l}} and ϵ l ∼ N ( 0 , I ) {\displaystyle \epsilon _{l}\sim {\mathcal {N}}(0,I)} for l = 1 , … , L {\displaystyle l=1,\ldots ,L} . This formulation enables backpropagation through the sampling process, allowing for end-to-end training of the VAE model using stochastic gradient descent or its variants. === Variational inference === More generally, the trick allows using stochastic gradient descent for variational inference. Let the variational objective (ELBO) be of the form: ELBO ( ϕ ) = E z ∼ q ϕ ( z ) [ log ⁡ p ( x , z ) − log ⁡ q ϕ ( z ) ] {\displaystyle {\text{ELBO}}(\phi )=\mathbb {E} _{z\sim q_{\phi }(z)}[\log p(x,z)-\log q_{\phi }(z)]} Using the reparameterization trick, we can estimate the gradient of this objective with respect to ϕ {\displaystyle \phi } : ∇ ϕ ELBO ( ϕ ) ≈ 1 L ∑ l = 1 L ∇ ϕ [ log ⁡ p ( x , g ϕ ( ϵ l ) ) − log ⁡ q ϕ ( g ϕ ( ϵ l ) ) ] , ϵ l ∼ p ( ϵ ) {\displaystyle \nabla _{\phi }{\text{ELBO}}(\phi )\approx {\frac {1}{L}}\sum _{l=1}^{L}\nabla _{\phi }[\log p(x,g_{\phi }(\epsilon _{l}))-\log q_{\phi }(g_{\phi }(\epsilon _{l}))],\quad \epsilon _{l}\sim p(\epsilon )} === Dropout === The reparameterization trick has been applied to reduce the variance in dropout, a regularization technique in neural networks. 
The original dropout can be reparameterized with Bernoulli distributions:

$$y=(W\odot\epsilon)x,\quad\epsilon_{ij}\sim\text{Bernoulli}(\alpha_{ij})$$

where $W$ is the weight matrix, $x$ is the input, and $\alpha_{ij}$ are the (fixed) dropout rates. More generally, distributions other than the Bernoulli distribution can be used, such as Gaussian noise:

$$y_{i}=\mu_{i}+\sigma_{i}\odot\epsilon_{i},\quad\epsilon_{i}\sim\mathcal{N}(0,I)$$

where $\mu_{i}=\mathbf{m}_{i}^{\top}x$ and $\sigma_{i}^{2}=\mathbf{v}_{i}^{\top}x^{2}$, with $\mathbf{m}_{i}$ and $\mathbf{v}_{i}$ being the mean and variance of the $i$-th output neuron. The reparameterization trick can be applied to all such cases, resulting in the variational dropout method.
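
As a rough one-dimensional illustration of the trick, the sketch below estimates the gradients of $\mathbb{E}[z^{2}]$ with respect to $\mu$ and $\sigma$ for $z\sim\mathcal{N}(\mu,\sigma^{2})$ by writing $z=\mu+\sigma\epsilon$. The objective $f(z)=z^{2}$ and the function name `reparam_gradients` are assumptions chosen only because the exact answers ($2\mu$ and $2\sigma$) are known. It is not a VAE implementation.

```python
import random


def reparam_gradients(mu, sigma, num_samples=100_000, seed=0):
    """Monte Carlo estimate of d/dmu and d/dsigma of E[z**2], z ~ N(mu, sigma**2)."""
    rng = random.Random(seed)
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)   # noise drawn independently of the parameters
        z = mu + sigma * eps        # reparameterized sample
        df_dz = 2.0 * z             # derivative of the toy objective f(z) = z**2
        grad_mu += df_dz            # chain rule with dz/dmu = 1
        grad_sigma += df_dz * eps   # chain rule with dz/dsigma = eps
    return grad_mu / num_samples, grad_sigma / num_samples


# Exact gradients for comparison are 2*mu = 3.0 and 2*sigma = 1.0.
g_mu, g_sigma = reparam_gradients(mu=1.5, sigma=0.5)
```

Because the randomness enters only through `eps`, the sample `z` is a differentiable function of `mu` and `sigma`, which is what lets an automatic-differentiation framework backpropagate through the sampling step.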

    Read more →
  • Hamilton C shell

    Hamilton C shell

    Hamilton C shell is a clone of the Unix C shell and utilities for Microsoft Windows created by Nicole Hamilton at Hamilton Laboratories as a completely original work, not based on any prior code. It was first released on OS/2 on December 12, 1988 and on Windows NT in July 1992. The OS/2 version was discontinued in 2003 but the Windows version continues to be actively supported. == Design == Hamilton C shell differs from the Unix C shell in several respects. These include its compiler architecture, its use of threads, and the decision to follow Windows rather than Unix conventions. === Parser === The original C shell uses an ad hoc parser. This has led to complaints about its limitations. It works well enough for the kinds of things users type interactively but not very well for the more complex commands a user might take time to write in a script. It is not possible, for example, to pipe the output of a foreach statement into grep. There was a limit to how complex a command it could handle. By contrast, Hamilton uses a top-down recursive descent parser that allows it to compile statements to an internal form before running them. As a result, statements can be nested or piped arbitrarily. The language has also been extended with built-in and user-defined procedures, local variables, floating point and additional expression, editing and wildcarding operators, including an "indefinite directory" wildcard construct written as "..." that matches zero or more directory levels as required to make the rest of the pattern match. === Threads === Lacking fork or a high performance way to recreate that functionality, Hamilton uses the Windows threads facilities instead. When a new thread is created, it runs within the same process space and it shares all of the process state. If one thread changes the current directory or the contents of memory, it's changed for all the threads. It's much cheaper to create a thread than a process but there's no isolation between them. 
To recreate the missing isolation of separate processes, the threads cooperate to share resources using locks. === Windows conventions === Hamilton differs from other Unix shells in that it also directly supports Windows conventions for drive letters, filename slashes, escape characters, etc.

    Read more →
  • Thai QR Payment

    Thai QR Payment

    Thai QR Payment or PromptPay (พร้อมเพย์) is a real-time payment system in Thailand that allows money transfers through digital channels using identifiers linked to a bank account, including a mobile phone number, citizen identification number, tax identification number or bank account number. The system was introduced in 2016 as part of Thailand's national e-payment infrastructure and was developed under the National e-Payment Master Plan, a government programme intended to expand digital payment infrastructure and reduce the use of cash in everyday transactions. It is owned by National ITMX Ltd. and the Bank of Thailand and was developed by Vocalink, a Mastercard company. == History == PromptPay (originally AnyID) is one of Thailand's National e-Payment projects and policies, intended to regulate and standardize electronic payments in step with the spread of internet and smartphone technology into finance and commerce. By 22 December 2015, the first Prayut cabinet had approved the project as national infrastructure. PromptPay has also been used in cross-border payment linkages with other real-time payment systems in Southeast Asia. In April 2021, the Monetary Authority of Singapore and the Bank of Thailand launched a linkage between Singapore's PayNow and Thailand's PromptPay, allowing customers of participating banks to send money between the two countries using a mobile phone number. In June 2021, the central banks of Thailand and Malaysia launched a cross-border QR payment linkage between PromptPay and Malaysia's DuitNow system. == Services == PromptPay's services have included Encrypted Transactions and Payment between Two Individuals (C2C) Government Infrastructure Payment Tax Returns Individual PromptPay e-Wallet Thai QR Payment Pay Alert e-Donation Cross Border QR Payment

    Read more →
  • Cloud-native computing

    Cloud-native computing

    Cloud native computing is an approach in software development that utilizes cloud computing to "build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds". Technologies such as containers, microservices, serverless functions, cloud native processors and immutable infrastructure, deployed via declarative code, are common elements of this architectural style. Cloud native technologies focus on minimizing users' operational burden. Cloud native techniques "enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil." This loose coupling contributes to the overall resilience of the system, as issues in one area do not necessarily cripple the entire application. Additionally, such systems are easier to manage and monitor, given their modular nature, which simplifies tracking performance and identifying issues. Frequently, cloud-native applications are built as a set of microservices that run in Open Container Initiative compliant containers, using a runtime such as containerd, and may be orchestrated in Kubernetes and managed and deployed using DevOps and Git CI workflows (although there is a large amount of competing open source software that supports cloud-native development). The advantage of using containers is the ability to package all software needed to execute into one executable package. The container runs in a virtualized environment, which isolates the contained application from its environment.

    Read more →
  • CMU Pronouncing Dictionary

    CMU Pronouncing Dictionary

    The CMU Pronouncing Dictionary (also known as CMUdict) is an open-source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research. CMUdict provides a mapping from orthographic to phonetic forms for English words in their North American pronunciations. It is commonly used to generate representations for speech recognition (ASR), e.g. the CMU Sphinx system, and speech synthesis (TTS), e.g. the Festival system. CMUdict can be used as a training corpus for building statistical grapheme-to-phoneme (g2p) models that will generate pronunciations for words not yet included in the dictionary. The most recent release is 0.7b; it contains over 134,000 entries. An interactive lookup version is available. == Database format == The database is distributed as a plain text file with one entry per line in the format "WORD  PRONUNCIATION", with a two-space separator between the parts. If multiple pronunciations are available for a word, variants are identified using numbered versions (e.g. WORD(1)). The pronunciation is encoded using a modified form of the ARPABET system, with the addition of stress marks on vowels of levels 0, 1, and 2. A line-initial ;;; token indicates a comment. A derived format, directly suitable for speech recognition engines, is also available as part of the distribution; this format collapses stress distinctions (typically not used in ASR). The following is a table of phonemes used by CMU Pronouncing Dictionary. == History == == Applications == The Unifon converter is based on the CMU Pronouncing Dictionary. The Natural Language Toolkit contains an interface to the CMU Pronouncing Dictionary. The Carnegie Mellon Logios tool incorporates the CMU Pronouncing Dictionary. PronunDict, a pronunciation dictionary of American English, uses the CMU Pronouncing Dictionary as its data source. Pronunciation is transcribed in IPA symbols. This dictionary also supports searching by pronunciation. 
Some singing voice synthesizer software, such as CeVIO Creative Studio and Synthesizer V, uses a modified version of the CMU Pronouncing Dictionary for synthesizing English singing voices. Transcriber, a tool for full-text phonetic transcription, uses the CMU Pronouncing Dictionary. 15.ai, a real-time text-to-speech tool using artificial intelligence, uses the CMU Pronouncing Dictionary.
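
The database format described above is simple enough to parse in a few lines. The following is a minimal sketch, not the project's own tooling: the function names `parse_cmudict` and `strip_stress` are hypothetical, and the sample lines were written out by hand for illustration rather than read from a release file.

```python
import re
from collections import defaultdict

# Optional variant suffix such as "(1)" at the end of the headword.
_VARIANT = re.compile(r"^(?P<word>.+?)(?:\((?P<n>\d+)\))?$")


def parse_cmudict(lines):
    """Return {word: [phoneme lists]} from CMUdict-style lines."""
    entries = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and ;;; comments
        head, _, pron = line.partition("  ")  # two-space separator
        match = _VARIANT.match(head)
        entries[match.group("word")].append(pron.split())
    return dict(entries)


def strip_stress(phonemes):
    """Drop the 0/1/2 stress digits, as a stress-collapsed format would."""
    return [p.rstrip("012") for p in phonemes]


sample = [
    ";;; hand-written sample lines for illustration",
    "READ  R EH1 D",
    "READ(1)  R IY1 D",
    "HELLO  HH AH0 L OW1",
]
lexicon = parse_cmudict(sample)
hello_no_stress = strip_stress(lexicon["HELLO"][0])
```

Variant entries such as `READ(1)` are folded into the same key as the base word, so each word maps to a list of alternative pronunciations.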

    Read more →
  • Pandas (software)

    Pandas (software)

    Pandas (styled as pandas) is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. It is free software released under the three-clause BSD license. The name is derived from the term "panel data", an econometrics term for data sets that include observations over multiple time periods for the same individuals, as well as a play on the phrase "Python data analysis". Wes McKinney started building what would become Pandas at AQR Capital while he was a researcher there from 2007 to 2010. The development of Pandas introduced into Python many comparable features of working with DataFrames that were established in the R programming language. The library is built upon another library, NumPy. == History == Developer Wes McKinney started working on Pandas in 2008 while at AQR Capital Management out of the need for a high performance, flexible tool to perform quantitative analysis on financial data. Before leaving AQR, he was able to convince management to allow him to open source the library in 2009. Another AQR employee, Chang She, joined the effort in 2012 as the second major contributor to the library. In 2015, Pandas signed on as a fiscally sponsored project of NumFOCUS, a 501(c)(3) nonprofit charity in the United States. == Data model == Pandas is built around data structures called Series and DataFrames. Data for these collections can be imported from various file formats such as comma-separated values, JSON, Parquet, SQL database tables or queries, and Microsoft Excel. === Series === A Series is a one-dimensional array-like object that stores a sequence of values together with an associated set of labels, called an index. It is built on top of NumPy's array and affords many similar functionalities, but instead of using implicit integer positions, a Series allows explicit index labels of many data types. 
A Series can be created from Python lists, dictionaries, or NumPy arrays. If no index is provided, pandas automatically assigns a default integer index ranging from 0 to n-1, where n is the number of items in the Series. Customized labels can be supplied in place of the default index. To access a value or list of values from a Series, use its index label or a list of index labels. Series can be used arithmetically, as in the statement series_3 = series_1 + series_2. This will align data points with corresponding index values in series_1 and series_2 (similar to a join in relational algebra), then add them together to produce new values in series_3. A Series has various attributes, such as name (Series name), dtype (data type of values), shape (number of rows), values, and index. They can be used in many of the same operations as NumPy arrays, with additional methods for reindexing, label-based selection, and handling missing data. === DataFrame === A DataFrame is a two-dimensional, tabular data structure with labeled rows and columns. Each column is stored internally as a Series and may hold a different data type (numeric, string, boolean, etc.). DataFrames can be created by a variety of means, including dictionaries of lists, NumPy arrays, and external files such as CSV or Excel spreadsheets. To retrieve a DataFrame column as a Series, use either 1) the index (dict-like notation) or 2) the name of the column, if the name is a valid Python identifier (attribute-like access). DataFrames support operations such as column assignment, row and column deletion, label-based indexing with loc, position-based indexing with iloc, reshaping, grouping, and joining. Merge operations implement a subset of relational algebra and allow one-to-one, many-to-one, and many-to-many joins. 
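
A short sketch of the Series and DataFrame operations described above. The data and the variable names (`prices`, `df`, and so on) are made up for illustration.

```python
import pandas as pd

# A Series with customized string labels instead of the default 0..n-1 index.
prices = pd.Series([1.5, 2.0, 3.25], index=["apple", "banana", "cherry"])

single = prices["banana"]              # one value, selected by label
subset = prices[["apple", "cherry"]]   # list of labels -> new Series

# Arithmetic aligns on index labels; labels present on only one side give NaN.
series_1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
series_2 = pd.Series([10, 20, 30], index=["b", "c", "d"])
series_3 = series_1 + series_2

# A DataFrame built from a dictionary of lists; each column is a Series.
df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 45]})
ages_by_key = df["age"]    # dict-like access
ages_by_attr = df.age      # attribute-like access (valid identifiers only)
```

In `series_3`, only the labels `b` and `c` exist in both operands, so only those positions hold sums; `a` and `d` are filled with NaN.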
Some common attributes of a DataFrame include dtypes (data type of each column), shape (dimensions of the DataFrame returned as a tuple with form (number of rows, number of columns)), index/columns (labels of the DataFrame's rows/columns, respectively, returned as an Index object), values (data in the DataFrame returned as a 2D array), and empty (returns True if the DataFrame is empty). === Index === Index objects hold metadata for Series and DataFrame objects, such as axis labels and names, and are automatically created from input data. By default, a pandas index is a series of integers ascending from 0, similar to the indices of Python lists. However, indices can also use any NumPy data type, including floating point, timestamps, or strings. Indices are also immutable, which allows them to be safely shared across multiple objects. pandas' syntax for mapping index values to relevant data is the same syntax Python uses to map dictionary keys to values. For example, if s is a Series, s['a'] will return the data point at index a. Unlike dictionary keys, index values are not guaranteed to be unique. If a Series uses the index value a for multiple data points, then s['a'] will instead return a new Series containing all matching values. A DataFrame's column names are stored and implemented identically to an index. As such, a DataFrame can be thought of as having two indices: one column-based and one row-based. Because column names are stored as an index, these are not required to be unique. If data is a Series, then data['a'] returns all values with the index value of a. However, if data is a DataFrame, then data['a'] returns all values in the column(s) named a. To avoid this ambiguity, Pandas supports the syntax data.loc['a'] as an alternative way to filter using the index. Pandas also supports the syntax data.iloc[n], which always takes an integer n and returns the nth value, counting from 0. 
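
The indexing behaviour described above can be sketched as follows. The objects `s` and `df` and their contents are invented for the example.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "a"])   # label "a" is repeated
has_unique_labels = s.index.is_unique
dupes = s["a"]     # repeated label -> Series holding every matching value

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])
column_a = df["a"]     # on a DataFrame, [] selects the column named "a"
row_y = df.loc["y"]    # .loc selects by row label
row_1 = df.iloc[1]     # .iloc selects by integer position, counting from 0

# Index objects are immutable: item assignment raises TypeError.
try:
    df.index[0] = "w"
    index_is_mutable = True
except TypeError:
    index_is_mutable = False
```

Here `row_y` and `row_1` refer to the same row, reached once by label and once by position.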
This allows a user to act as though the index is an array-like sequence of integers, regardless of how it is actually defined. pandas also supports hierarchical indices with multiple values per data point through the "MultiIndex" class. MultiIndex objects allow a single DataFrame to represent multiple dimensions, similar to a pivot table in Microsoft Excel, where each level can optionally carry its own unique name. In practice, data with more than 2 dimensions is often represented using DataFrames with hierarchical indices, instead of the higher-dimension Panel and Panel4D data structures, which have since been removed from the library. == Functionality == pandas supports a variety of indexing and subsetting techniques, allowing data to be selected by label, index, or Boolean conditions. For example, df[df['col1'] > 5] will return all rows in the DataFrame df for which the value of the column col1 exceeds 5. The library also implements grouping operations based on the split-apply-combine approach, enabling users to aggregate, transform, or restructure data according to column values or functions applied to index labels. For example, df['col1'].groupby(df['col2']) groups the data in 'col1' by their values in 'col2', while df.groupby(lambda i: i % 2) groups all data in the whole DataFrame by whether their index is even. The library also provides extensive tools for transforming, filtering and summarizing data. Users may apply arbitrary functions to Series and DataFrames, and because the library is built on top of NumPy, most NumPy functions can be applied directly to pandas objects as well. The library also includes built-in operations for arithmetic operations, string processing, and descriptive statistics such as mean, median, and standard deviation. These built-in functions are designed to handle missing data, usually represented by the floating-point value NaN. 
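
A brief sketch of the selection, grouping, and missing-data behaviour described above, reusing the column names `col1` and `col2` from the text. The values are made up.

```python
import pandas as pd

df = pd.DataFrame({"col1": [3, 8, 6, 2], "col2": ["x", "y", "x", "y"]})

over_five = df[df["col1"] > 5]   # Boolean-mask row selection

# Split-apply-combine: group col1 by the values in col2, then sum each group.
sums = df["col1"].groupby(df["col2"]).sum()

# Group rows by a function of the index label (even vs. odd index).
parity_means = df[["col1"]].groupby(lambda i: i % 2).mean()

# Descriptive statistics skip NaN by default.
mean_skipping_nan = pd.Series([1.0, float("nan"), 3.0]).mean()
```

The numeric column is selected before the index-based `groupby` so that `mean()` is not applied to the string column `col2`.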
In addition, pandas includes tools for reorganizing data into different structural formats, with methods that can reshape tabular data between "wide" and "long" formats and pivot values based on column labels. pandas also implements a flexible set of relational operations for combining datasets. For instance, merge() links rows in DataFrames based on one or more shared keys or indices, supporting one-to-one, one-to-many, and many-to-many relationships in a manner analogous to join operations in relational databases like SQL. DataFrames can also be concatenated or stacked together along an axis through the concat() function, and overlapping data can be further spliced together using combine_first() to fill in missing values. Furthermore, the library includes specialized support for working with time-series data. Features include the ability to interpolate values and filter using a range of timestamps, such as data['1/1/2023':'2/2/2023'], which will return all rows dated between January 1 and February 2, 2023. Missing values in time-series data are represented by a dedicated NaT (Not a Timestamp) object, instead of the NaN value it uses elsewhere. == Criticisms == Pandas has been criticized for its inefficiency. The entire dataset must be loaded in RAM, and the library does not optimize query plans or support parallel computing across multiple cores. Wes McKinney, the creator of Pandas, has recommended Apache Arrow as an alternative to address these performance concerns and ot

    Read more →
  • Process map

    Process map

    Process map is a global-system process model that is used to outline the processes that make up the business system and how they interact with each other. Process map shows the processes as objects, which means it is a static and non-algorithmic view of the processes. It should be differentiated from a detailed process model, which shows a dynamic and algorithmic view of the processes, usually known as a process flow diagram. There are different notation standards that can be used for modelling process maps, but the most notable ones are TOGAF Event Diagram, Eriksson-Penker notation, and ARIS Value Added Chain. == Global process models == Global characteristics of the business system are captured by global or system models. Global process models are presented using different methodologies and sometimes under different names. Most notably, they are named process map in Visual Paradigm and MMABP, value-added chain in ARIS, and process diagram in Eriksson-Penker notation – which can easily lead to the confusion with process flow (detailed process model). Global models are mainly object-oriented and present a static view of the business system; they do not describe dynamic aspects of processes. A process map shows the presence of processes and their mutual relationships. The requirement for the global perspective of the system as a supplementary to the internal process logic description results from the necessity of taking into consideration not only the internal process logic but also its significant surroundings. The algorithmic process model cannot take the place of this perspective since it represents the system model of the process. The detailed process model and the global process model represent different perspectives on the same business system, so these models must be mutually consistent. A macro process map represents the major processes required to deliver a product or service to the customer. 
These macro process maps can be further detailed in sub-diagrams. It is often the case that process maps cross different functional areas of the organization. Process maps are used by many companies to have a holistic view of all processes and the connections between them. Maps help in navigating the sub-processes and make understanding of the organization's operations easier. The process map shows relationships and dependencies between processes and its focus should be on core business processes of the organization. A process map can be seen as the most abstract level of the process architecture, and it acts as the introduction to the more detailed levels. A process map that is correctly designed is able to provide a general understanding of a company's operations. Designing the process map is an important and strategic step for the organization, and it is followed by further business process modelling implementation. == Context == Methodology for Modelling and Analysis of Business Process (MMABP) is a business process modelling methodology developed at the Department of Information Technology, Faculty of Informatics and Statistics of the Prague University of Economics and Business. The methodology is defined as a “general methodology for modelling business systems using informatics methods and approaches”. Methodology is used to analyse business processes and to develop a comprehensive model of the system. The goal of developing a model is to be used for process optimization. The model should be created following the characteristics and specifics of the organization in question and following external influences that can affect the organization. The model should be optimal from an economic perspective, but it should also be optimal from a factual perspective, meaning that it should be as simple as possible while maintaining complete functionality. 
Business system modelling is based on a two-dimensional approach:
* Real World structure (substance) – set of objects and their relationships
* Real World behaviour – set of mutually connected business processes
Additionally, there are also two views of the system:
* Global view of the system
* Detailed view of the system's parts
This results in the need to model the system from four different perspectives in order to achieve a complete and comprehensive view of the business system. MMABP also proposes which notation languages can be used for modelling each perspective, and it also suggests some improvements to the notation languages in order to fit the purpose.
* Global view of the objects – Conceptual model (Class diagram)
* Detailed view of the objects – Object life cycle (State Chart)
* Global view of the processes – Process map (Eriksson-Penker Diagram/TOGAF Event Diagram/ARIS VAC)
* Detailed view of the processes – Model of the process flow (BPMN Diagram)
The Data Flow Diagram (DFD) is an additional diagram used for describing the required functionalities of the information system. == Notation standards == === Eriksson-Penker Diagram === The Eriksson-Penker diagram is a tool used in business model analysis and design. It is named after Hans-Erik Eriksson and Magnus Penker, who developed the concept in their book "Business modelling with UML: Business Patterns at Work". Eriksson-Penker diagrams are used to map out the key components of a business model and how they interact with one another. The diagrams typically consist of a series of boxes and lines that represent the different elements of the business model, such as the value proposition, customer segments, channels, revenue streams, and key resources. The lines between the boxes represent the relationships and dependencies between the different elements of the business model. 
These diagrams are useful for visualizing and understanding the various components of a business model, and can help organizations identify potential areas for improvement or areas of risk. They can also be used as a communication tool to help stakeholders understand the business model and its underlying assumptions. It is possible to use Eriksson-Penker diagrams to create a global process view of a business. In this case, a diagram would be used to map out the key processes and activities that are involved in the business, as well as the relationships and dependencies between these processes. For example, an Eriksson-Penker diagram could be used to depict the various steps involved in the product development process, from concept development to market launch. It could also be used to show how different functions within the organization, such as marketing, sales, and production, interact and depend on one another to support the overall business. Eriksson-Penker diagram is one of the most popular de facto standards that can be used for an object-oriented global view of business processes. It is developed as an extension of the UML, and it is often used together with the BPMN to compensate for the lack of possibility to model the global view with this widely accepted standard. === TOGAF Event Diagram === TOGAF (The Open Group Architecture Framework) is a framework for enterprise architecture that provides a common language and set of standards for designing, planning, implementing, and governing an enterprise's IT architecture. TOGAF event diagrams are diagrams used in the TOGAF framework to represent the flow of events within a system or process. 
The TOGAF Event Diagram is a visual representation of the events within an organization or system. It can be used to show the sequence of events that occur in a particular process, as well as the relationships between the events and the stakeholders involved. TOGAF Event Diagrams can be useful in creating a global process view because they provide a visual representation of the events, which can be helpful in understanding how the process fits into the larger context of the organization. The TOGAF Event Diagram is the most promising standard for the system view of processes today. It is used to represent the system of processes as well as their connections to the functional organizational structure. === ARIS Value Added Chain === ARIS (Architecture of Integrated Information Systems) is a methodology and a set of tools for designing and managing business processes. It is based on the idea that business processes are the core of an organization and that they can be modelled and optimized to improve efficiency and effectiveness. The ARIS methodology provides a framework for understanding and analysing business processes, as well as for designing and implementing improvements to those processes. It includes a set of graphical modelling languages and tools for creating process models, as well as a database for storing and managing pr

    Read more →
  • Deep image compositing

    Deep image compositing

    Deep image compositing is a way of compositing and rendering digital images that emerged in the mid-2010s. In addition to the usual color and opacity channels a notion of spatial depth is created. This allows multiple samples in the depth of the image to make up the final resulting color. This technique produces high quality results and removes artifacts around edges that could not be dealt with otherwise. == Deep data == Deep data is encoded by advanced 3D renderers into an image that samples information about the path each rendered pixel takes along the z axis extending outward from the virtual camera through space, including the color and opacity of every non-opaque surface or volume it passes through along the way, as well as neighboring samples. It might be considered somewhat analogous to the way ray tracing generates simulated photon paths through such mediums; however, ray tracing and other traditional rendering techniques generally produce images that contain only three or four channels of color and opacity values per pixel, flattened into a two dimensional frame. Depth maps, on the other hand, contain z axis information encoded in a grayscale image. Each level of gray represents a different slice of the z space. The "thickness" of each slice is determined at time of render, allowing for more or less depth fidelity depending on how deep the scene is. Depth maps have been a boon to compositors for blending 3D renders with live action and practical elements. To be useful, the map must have high enough bit depth to encode separation between close-to-camera objects and objects near infinity. Most 3D software packages are now capable of generating 16-bit and 32-bit depth maps, providing up to 2 billion depth levels. 
Depth maps do not, however, include transparency information about non-opaque surfaces or volumes, and as such, objects beyond and viewed through these semi- or fully-transparent objects will have no depth information of their own and may not get composited or blurred correctly. Even the popular addition of cryptomattes to many post-production and VFX studios' pipelines, while providing separate color-coded ID shapes for individual elements in a rendered scene to further bridge the gap between CGI and compositing, doesn't allow for the nearly automated and fully non-linear workflows that deep data does. This is because deep images encapsulate enough 3D information that normally time-intensive tasks, such as rotoscoping with numerous holdout mattes for complex interactions between moving characters and semi-transparent environmental volumes like smoke or water, are essentially trivial. Instead of going through that process, multiple mattes could easily be generated from a single set of deep images with no need to re-render every matte element and background for each case. In addition to that efficiency and flexibility, deep data images inherently provide much higher visual quality in common areas that have been difficult with traditional renders, such as the motion-blurred edges of characters with semi-transparent elements like hair. One downside to the use of deep images is their substantial file size, since they encode a relatively enormous amount of data per frame compared to even multichannel formats such as OpenEXR. === Function-based (integrated) === The data is stored as a function of depth. This results in a function curve that can be used to look up the data at any arbitrary depth. Manipulating the data is harder. === Sample-based (deintegrated) === Each sample is considered as an independent piece and can therefore be manipulated easily. To make sure the data is representing the right detail, an additional expand value needs to be introduced. 
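
The sample-based representation can be sketched in a few lines. This is a deliberately simplified illustration, not any renderer's or file format's actual data layout: the `DeepSample` class and both functions are hypothetical, each sample is treated as an infinitely thin surface at a single depth, colors are assumed premultiplied by alpha, and the expand value mentioned above is ignored.

```python
from dataclasses import dataclass


@dataclass
class DeepSample:
    depth: float    # distance from the camera along z
    color: tuple    # premultiplied (r, g, b)
    alpha: float    # opacity in [0, 1]


def merge_deep_pixels(*pixels):
    """Combine the samples of several deep pixels, ordered near to far."""
    merged = [sample for pixel in pixels for sample in pixel]
    return sorted(merged, key=lambda s: s.depth)


def flatten(samples):
    """Front-to-back 'over' compositing of depth-sorted samples into one RGBA value."""
    r = g = b = a = 0.0
    for s in samples:
        remaining = 1.0 - a   # fraction of this sample not hidden by nearer ones
        r += remaining * s.color[0]
        g += remaining * s.color[1]
        b += remaining * s.color[2]
        a += remaining * s.alpha
    return (r, g, b, a)


smoke = [DeepSample(2.0, (0.25, 0.25, 0.25), 0.5)]   # semi-transparent, nearer
wall = [DeepSample(5.0, (1.0, 0.0, 0.0), 1.0)]       # opaque, farther
pixel = flatten(merge_deep_pixels(wall, smoke))
```

The layering order comes from the stored depths rather than from the order in which the inputs are supplied, so passing `wall` before `smoke` still places the smoke in front.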
== Generating deep data == 3D renderers produce the necessary data as a part of the rendering pipeline. Samples are gathered in depth and then combined. The deep data can be written out before this happens and so is nothing new to the process. Generating deep data from camera data needs a proper depth map. This is used in a couple of cases but is still not accurate enough for detailed representation. For basic holdout tasks, though, this can be sufficient. == Compositing deep data images == Deep images can be composited like regular images. The depth component makes it easier to determine the layering order. Traditionally this had to be input by the user. Deep images carry that information themselves and need no user input. Edge artifacts are reduced as transparent pixels have more data to work with. == History == Deep images have been around in 3D rendering packages for quite a while. Their use for holdouts was first done at several VFX houses in shaders. Holdout mattes can be generated at render time. Several companies later began using them in a more interactive manner: SideFX integrated deep images in its Houdini software, and facilities like Industrial Light & Magic, DreamWorks Animation, Weta, AnimalLogic and DRD studios have implemented interactive solutions. In 2014 the Academy of Motion Picture Arts and Sciences honored the technology with its annual SciTech awards. The awards went to Dr. Peter Hillman for the long-term development and continued advancement of innovative, robust and complete toolsets for deep compositing and to Colin Doncaster, Johannes Saam, Areito Echevarria, Janne Kontkanen and Chris Cooper for the development, prototyping and promotion of technologies and workflows for deep compositing. 

  • RemObjects Software

    RemObjects Software

    RemObjects Software is an American software company founded in 2002 by Alessandro Federici and Marc Hoffman. It develops and offers tools and libraries for software developers on a variety of development platforms, including Embarcadero Delphi, Microsoft .NET, Mono, and Apple's Xcode. == History == RemObjects Software was founded in the summer of 2002. Its first product was RemObjects SDK 1.0 for Delphi, the company's remoting solution, which is now in its 6th version. In late 2003 RemObjects expanded its product portfolio to add Data Abstract for Delphi, a multi-tier database framework built on top of the SDK. In 2004, Carlo Kok, who would eventually become Chief Compiler Architect for Oxygene, joined the company, adding the open source Pascal Script library for Delphi to the company's portfolio. Initial development began on Oxygene (which was then named Chrome) based on Carlo's experience from writing the widely used Pascal Script scripting engine. Towards the end of 2004, RemObjects SDK for .NET was released, expanding the remoting framework to its second platform. Chrome 1.0 was released in mid-2005, providing support for .NET 1.1 and .NET 2.0, which was still in beta at the time, making Chrome the first shipping language for .NET that supported features such as generics. It was followed by Chrome 1.5 when .NET 2.0 shipped in November of the same year. The year 2005 also saw the expansion of Data Abstract to .NET as a second platform. Data Abstract for .NET was the first RemObjects product (besides Oxygene itself) to be written in Oxygene. Hydra 3.0 was released for .NET in December 2006, bringing a paradigm shift to the product, away from a regular plugin framework and towards interoperability between plugins and host applications written in either .NET or Delphi/Win32, essentially enabling the use of both managed and unmanaged code in the same project. In summer 2007, RemObjects released Chrome 'Joyride', which added official support for .NET 3.0 and 3.5. 
Chrome once again was the first language to ship release-level support for new .NET framework features supported by that runtime, most importantly Sequences and Queries (aka LINQ). Development continued and in May 2008 Oxygene 3.0 was released, dropping the "Chrome" moniker. Oxygene once again brought major language enhancements, including extensive support for concurrency and parallel programming as part of the language syntax. In October 2008, RemObjects Software and Embarcadero Technologies announced plans to collaborate and ship future versions of Oxygene under the Delphi Prism moniker, later changed to Embarcadero Prism. The first of these releases of Prism became available in December 2008. Over the course of 2009, RemObjects Software completed the expansion of its Data Abstract and RemObjects SDK product combo to a third development platform, Xcode and Cocoa, for both Mac OS X and iPhone SDK client development. RemObjects SDK for OS X shipped in the spring of 2009, followed by Data Abstract for OS X in the fall. In 2011, Oxygene was expanded to add support for the Java platform, in addition to .NET. In 2014, RemObjects introduced a C# compiler that runs as a Visual Studio 2013 plugin and can output code for iOS, macOS (Cocoa) and Android, in addition to .NET-compatible code. In addition, an IDE called Fire was introduced for macOS, which works with the company's C# and Oxygene compilers. Together, the compiler supporting both Oxygene and C# was rebranded as the Elements Compiler, with C# having the code name "Hydrogene". In February 2015, RemObjects introduced a beta version of a Swift compiler called Silver as part of its Elements effort. Silver, too, could create code that executes on Android, the JVM and the .NET platform, and could also create native Cocoa code. Silver added new features to the Swift language, such as exceptions, and has a few differences and limitations compared to Apple's Swift. 
In February 2020, support for the Go programming language was introduced with RemObjects Gold, including the ability to compile Go language code for all Elements platforms, and a port of the extensive Go Base Library available to all Elements languages. In 2021, Mercury was added to the Elements compiler as the sixth language, providing a future for the Visual Basic .NET language recently deprecated by Microsoft. Mercury supports building and maintaining existing VB.NET projects, as well as using the language for new projects both on .NET and the other platforms. == Commercial products == Elements is a development toolchain that targets the .NET runtime, Java/Android virtual machines, the Apple ecosystem (macOS, iOS, tvOS), WebAssembly, and processor-native machine code for Windows, Linux and the Android NDK, in conjunction with a runtime library that performs automatic garbage collection on non-ARC environments and uses ARC on ARC-based environments, such as iOS and macOS. Because Java, C#, Swift, and Oxygene can all import each other's APIs, Elements effectively functions as a confederation of closely cooperating languages. 
Oxygene, a unique programming language based on Object Pascal, which can import Java, C#, and Swift APIs from the runtime of the target operating system; RemObjects C#, an implementation of the C# programming language, which can import Java, Swift, and Oxygene APIs from the runtime of the target operating system and which is intended as a competitor of Xamarin (unlike Xamarin's C#, which compiles only to Common Language Infrastructure bytecode and needs the accompanying Mono Common Language Runtime to be present in JVM-centric environments such as Android, Hydrogene's C# targets JVM bytecode); Silver, a free implementation of the Swift programming language, which can import Java, C#, and Oxygene APIs from the runtime of the target operating system; Iodine, an implementation of the Java programming language; Gold, an implementation of the Go programming language; Mercury, an implementation of the Visual Basic .NET programming language; Fire, an integrated development environment for macOS; Water, an integrated development environment for Windows; Data Abstract; Remoting SDK, a.k.a. RemObjects SDK; Hydra; Oxfuscator; Oxidizer, an automatic translator from Java, C#, Objective-C, and Delphi to Oxygene, from Java, Objective-C, and C# to Swift, and from Java and Objective-C to C#. == Open source projects == Train is an open-source JavaScript-based tool for building and running build scripts and automation. Internet Pack for .NET is a free, open source library for building network clients and servers using TCP and higher-level protocols such as HTTP or FTP, using the .NET or Mono platforms. It includes a range of ready-to-use protocol implementations, as well as base classes that allow the creation of custom implementations. RemObjects Script for .NET is a fully managed ECMAScript implementation for .NET and Mono. Pascal Script for Delphi is a widely used implementation of Pascal as a scripting language. 
== Involvement of other projects == === The Oxygene Compiler === Oxygene is a language based on Object Pascal and designed to efficiently target the Microsoft .NET and Mono managed runtimes; it expands Object Pascal with a range of additional language features, such as Aspect Oriented Programming, Class Contracts and support for Parallelism. It integrates with the Microsoft Visual Studio and MonoDevelop IDEs.

  • Morphobank

    Morphobank

    MorphoBank is a web application for collaborative evolutionary research, specifically phylogenetic systematics or cladistics, on the phenotype. Historically, scientists conducting research on phylogenetic systematics have worked individually or in small groups employing traditional single-user software applications such as MacClade, Mesquite and Nexus Data Editor. As the hypotheses under study have grown more complex, large research teams have assembled to tackle the problem of discovering the Tree of Life for the estimated 4-100 million living species (Wilson 2003, pp. 77–80) and the many thousands more extinct species known from fossils. Because the phenotype is fundamentally visual, and as phenotype-based phylogenetic studies have continued to increase in size, it has become important that observations be backed up by labeled images. Traditional desktop software applications currently in wide use do not provide robust support for team-based research or for image manipulation and storage. MorphoBank is a particularly important tool for the growing scientific field of phenomics. The development of MorphoBank, which began in 2001, has been funded by the National Science Foundation's Directorates for Geosciences, Biological Sciences and Computer and Information Science and Engineering. The significance of the scientific work on MorphoBank has been featured in the New York Times, among other publications. == Advantages == Teams of scientists studying phylogenetics to build the Tree of Life assemble large spreadsheets of observations about species (referred to as "matrices"). These teams require simultaneous access by each team member to a single and secure copy of the team's data during a scientific research project. This single copy of the data also changes with great frequency during the data collection phase. Images that can be very helpful for documenting homology statements must be displayed, labeled and shared as homology statements develop. 
This cannot be accomplished elegantly with a desktop software package alone because in a desktop environment each collaborator is working on his own private copy of project data. Changes made by one participant cannot automatically propagate to others, preventing collaborators from seeing each other's data edits until they are manually (and due to the effort involved, often only periodically) merged into a single "true" dataset. In all but the smallest and most disciplined of teams, file version control and the reconciliation of changes made on multiple copies of the data emerge quickly as significant drags on productivity. MorphoBank is an attempt to address these issues by leveraging the ubiquity of the web and modern web-based application techniques, including Ajax, web service layers, and rich web applications to provide a full-featured, net-accessible collaborative workspace for phylogenetic research. In particular, MorphoBank makes it easy to: Share all kinds of data with geographically separated team members, including taxonomy, character and specimen data, media (including images, video and audio), phylogenetic matrices (including data in the widely used NEXUS and TNT format) and other data such as documents and genetic sequences. Label high-resolution images using a web-based image annotation application. Collaboratively edit project data such as phylogenetic matrices using a built-in web-based matrix editor. The editor allows the linking of labeled images to individual cells of a matrix. Manage access to project data. Access ranges from full-access for team members to anonymous read-only access for potential reviewers. Publish completed project data on the web in support of a published paper with a persistent URL. Search The Encyclopedia of Life for taxon exemplar images. Store high resolution CT data Create ontologies for updating and populating matrix cells. These tasks are difficult or impossible in most existing software applications. 
== History == In 2001 the National Science Foundation (NSF) sponsored a workshop at the American Museum of Natural History in New York to develop the outlines of a web-based system for a collaborative, media-rich research tool for morphological phylogenetics. An application prototype presented at the workshop was later refined with feedback from the workshop and became MorphoBank version 1.0. A grant from the US National Oceanic and Atmospheric Administration funded further revisions resulting in version 2.0, released in 2005. Support from the NSF is funding ongoing feature enhancements to MorphoBank. MorphoBank was hosted by Stony Brook University until late October 2021 and received backup support from the American Museum of Natural History. The current version is 3.0. The rationale for the software was described in the journal Cladistics. MorphoBank has also received support from NESCENT and the San Diego Supercomputer Center. Since 2018, MorphoBank has been supported in part by Phoenix Bioinformatics, a non-profit company founded to sustain databases for the basic sciences. A permanent move of MorphoBank from Stony Brook University to Phoenix Bioinformatics was completed in late October 2021. The San Diego Supercomputer Center has previously provided technical and hosting resources to the MorphoBank project. == Usage == MorphoBank hosts the products of peer-reviewed scientific research on phenotypes. An increasing volume of systematics data is "born digital" and MorphoBank is well suited to handle this type of material. On August 24, 2007, 62 active research projects were hosted by MorphoBank, as well as 6 completed (and published) projects. By 2017 over 2000 scientists and their students were registered content builders (users are not required to register and are even more numerous), and the site had more than 500 publicly available projects with approximately 80,000 images that are the products of scientific research. 
Over 1,500 active research projects are hosted by MorphoBank. The software has been used to assemble phylogenetic research on such groups as mammals, from bats to whales, bivalve molluscs, arachnids, fossil plants and living and extinct amniotes. It has also been used more broadly in evolutionary and paleontological research to host curated images associated with published research on lacewing insects, geckos, raptor birds, dinosaurs, frogs and nematodes. MorphoBank is increasingly used in conjunction with the Paleobiology Database. Example published projects: Project 1097: Blank CE, 2013, Origin and early evolution of photosynthetic eukaryotes in freshwater environments – reinterpreting proterozoic paleobiology and biogeochemical processes in light of trait evolution; Project 2520: Carvalho, T. P., R. E. Reis, and J. P. Friel, 2017, A new species of Hoplomyzon (Siluriformes: Aspredinidae) from Maracaibo Basin, Venezuela: osteological description using high-resolution; Project 2651: Baron, M. G., Norman, D. B., Barrett, P. M., 2017, A new hypothesis of dinosaur relationships and early dinosaur evolution. MorphoBank has been particularly important to the Assembling the Tree of Life initiative sponsored by the National Science Foundation. MorphoBank is well-suited to such projects because of its tools for merging taxonomic, character and matrix-based data, as well as its collaborative features. Highlights of this research include a collaborative matrix on mammal evolution published in Science that included over 4,000 phenomic characters scored for over 80 species, and a matrix on extant baleen whales featuring nearly 600 images.

  • Mentimeter

    Mentimeter

    Mentimeter (or Menti for short) is a Swedish company based in Stockholm that develops and maintains an eponymous app used to create presentations with real-time feedback. == Foundation and background == The Mentimeter app was started by Swedish entrepreneurs Johnny Warström and Niklas Ingvar as a response to unproductive meetings. The initial start-up budget was $500,000, raised in 2014 from a group of prominent investors including Per Appelgren, following the market's tendency to invest in Scandinavia. The app also focuses on online collaboration for the education sector, allowing students or members of the public to answer questions anonymously. The app enables users to share knowledge and real-time feedback on mobile devices with presentations, polls or brainstorming sessions in classes, meetings, gatherings, conferences and other group activities. == Achievements == By 2021, Mentimeter had over 270 million users and was one of Sweden's fastest-growing startups. The company also ranked #10 on 20 Fastest Growing 500 Startups Batch 16 Companies. It was ranked Stockholm's fastest-growing company in the 2018 edition of the DI Gasell Award. Mentimeter has a freemium business model.
