AI Face Combiner

AI Face Combiner — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Desktop video

    Desktop video

    Desktop video refers to a phenomenon lasting from the mid-1980s to the early 1990s when the graphics capabilities of personal computers such as the Amiga, Macintosh II, and specially-upgraded IBM PC compatibles had advanced to the point where individuals and local broadcasters could use them for analog non-linear editing and vision mixing in video production. Despite the use of computers, desktop video should not be confused with digital video since the video data remained analog, and it uses items like a VCR and a camcorder to record the video. Full-screen, full-motion video's vast storage requirements meant that the promise of digital encoding would not be realized on desktop computers for at least another decade. == Description == There were multiple models of genlock cards available to synchronize the content; the Newtek Video Toaster was commonly used in Amiga in countries that used NTSC (PAL-M in Brazil), while PCs had Truevision and Matrox Illuminator cards and Mac systems had the SuperMac Video Spigot and Radius VideoVision cards. Apple later introduced the Macintosh Quadra 840AV and Centris 660AV systems to specifically address this market. Desktop video was a parallel development to desktop publishing and enabled many small production houses and local TV stations to produce their own original content for the first time. Along with the advent of public-access cable channels, desktop video meant that television advertising became affordable for local businesses such as retailers, restaurants, real estate agents, contractors and auto dealers. As with the phrase desktop publishing, use of the term died out as the technologies to which it referred become the norm for any kind of video production.

    Read more →
  • TRAME

    TRAME

    TRAME (TRAnsmission of MEssages) was the name of the second computer network in the world similar to the internet to be used in an electric utility. Like the internet, the base technology was packet switching; it was developed by the electric utility ENHER in Barcelona. It was deployed by the same utility, first in Catalonia and Aragón, Spain, and later in other places. Its development started in 1974 and the first routers, called nodes at that time, were deployed by 1978. The network was in operation until 2016 (38 years) with successive technological software and hardware updates. == Beginnings == In 1974, packet switching was a technology known only in research circles. The concept began in 1968 in association with the United States' Advanced Research Projects Agency (ARPA) research project ARPANET. The idea of applying the packet switching concept to electric utilities control communication networks first appeared in 1974 when the Swedish power utility Vattenfall started to create its TIDAS packet-switching network and was followed by the Spanish electric utility ENHER, which aimed to telecontrol and automate its high-voltage power grid. For this purpose, ENHER created a specific team of people to develop both the packet-switching network and the supervisory control and data acquisition (SCADA) system, also called the telecontrol system. By 1978 the first four TRAME routers were available and by 1980, eight of them were deployed and operating. The printed circuit boards (PCBs) controlling the communication lines were connected to a shared memory PCB allowing them to exchange data and messages. The project was developed together with its main initial application, the Telecontrol or SCADA system SICL (Sistema Integral de Control Local) with which initially they shared a very similar hardware. The maximum link capacity was 9600 bit/s, which in 1980 was the maximum possible on a 4 kHz wide voice channel at the time. These channels were the basic unit of the then-analog communication systems in use. By that time power utilities used either telephone calls or low speed (below 1200bit/s) dedicated links for telecontrol, typically shared among ten high-voltage electrical substations. == Services == The basic service provided by the TRAME network was SCADA or Telecontrol to automate the high-voltage power grid, thus improving operational efficiency, which was until then operated manually with telephone communication between human operators. Each TRAME router was associated with one or more remote terminal units (RTUs) of the SICL telecontrol system. It also had connected screens, and later PCs, located in electrical substations to interchange messages between them and with the Control Center located in the well-known Casa Fuster in Barcelona. It was a kind of predecessor to today's e-mail. Later, in the 1990s, other protocols (X.25, IP) were developed to include corporate information technology (IT) terminals, company physical surveillance systems and other services. Additionally, applications and terminals were developed for the transmission of voice and video over the TRAME network. == Protocols == The TRAME routing system, like that of the original ARPANET, was based on the Bellman-Ford algorithm but with "split-horizon" as in the Swedish TIDAS network, but with an original improvement. This protocol allows optimal paths to be found in meshed networks for each packet to be transmitted, allowing the shared use of the same network by multiple services. In contrast, traditional circuit-switched technology used to establish dedicated circuits for each service or communication. The addressing of routers and terminals used a proprietary system with a 16-bit address; it would be the equivalent of the well-known IP (Internet Protocol) version 4 (IPv4), still in use on the internet today, which uses 32-bit addresses. It is necessary to take into account that in 1978, the IPv4 protocol did not yet exist since the IPv4 version used on the internet did not appear until 1981, and in fact, did not reach the general public until much later. The line protocols were also proprietary and were called UCL (Unidad de Control de Línea, 'line control unit'), which linked the routers together, and UTR (Unión TRAME-Remotas), the access protocol. They were designed to offer the highest quality of service required by the telecontrol/SCADA function in terms of data integrity and availability set by the International Electrotechnical Commission (IEC) IEC-870-5-1 and ANSI C37.1. standards, and because the protocol used at the time in corporate computer networks, HDLC (high-level data link control), did not offer enough quality for critical industrial applications. Later on, other protocols like X.25 and IP were also made compatible with the aforementioned TRAME protocols. In 2000, the UTR protocol was replaced by the international standard IEC 60870- 5-101/104. Initially network flow control was based on the management of eight data priorities in head-of-the-line (HOL) waiting queues. Later and after some experimentation, a flow control method based on a bit indicating route congestion and management of the gap between packets when accessing the network was adopted. This required measuring the capacity of the route bottleneck. An end-to-end protocol was also added for some flows requiring order preservation like X.25. == Evolution == To last for 38 years, the technology had to endure intense evolution. There were essentially four TRAME generations which are summarized in the table. A description of the four generations of TRAME is provided below. === TRAME 1 === The project began in 1974 and in 1978 a first network with four routers was already installed and in operation at the electric utility ENHER. In 1980, the network had eight nodes in operation (see Figure I). The hardware was based on the Zilog Z80 processor and had a multiprocessor structure with 16 processors sharing a common memory. The software was developed at ENHER's headquarters located in the well-known Casa Fuster, Passeig de Gràcia, 132, Barcelona, using the Z80 assembly language. Beyond 1980 the software began to be written in C programming language and an HP64000 Logic Development System emulator was used for the purpose. The hardware was produced by ISEL, an INI (Instituto Nacional de Indústria) company. The routing system was a variant of Bellman-Ford with split-horizon. It was an improvement of the original ARPA network routing system consisting of an original update procedure which allowed for a faster reaction to changes. The distance function was the number of packets in the output waiting queues plus one. The line protocols (UCL for internal lines linking routers and UTR for accessing the network) were designed to meet the stringent requirements set for telecontrol (SCADA) of high-voltage power networks (IEC-870-5-1 and ANSI C37.1 standards). At the OSI transport layer, windows with a width of 1 to 8, depending on the required service, residing in the terminals were used. Initially, addresses were only 14 bits long to address both the routers (called nodes by then) and the devices connected to them. They were made up of two fields, an 8-bit field to address the router and a 6-bit sub-address to address the terminals connected to it. The node address was assigned to the nodes and not to the ends of the links as in the internet. The basic advantages of TRAME over other technologies used in electric utilities at the time were in part due to the packet technology itself: ability to manage any network topology, automatic adaptability to topological and traffic changes, integration of different link technologies (digital or analog) and capacities in a single network, open and decentralized intercommunicability between users and devices, simultaneous communication with several users and locations from a single physical connection, and integrated network supervision. In fact, the network was provided from its inception with a supervision center consisting of a computer and a synoptic board located at the company's headquarters (see Figure II). But other advantages were due to the specific design of TRAME: high data integrity, priority support for packets, and ease of including special protocols such as the many SCADA protocols in use at that time. All of the above resulted in improved quality of service, especially with respect to data availability and data integrity, and in the integration of services in a single network. Part of the evolution of its deployment can be seen in Figures II to IV. === TRAME 2 === In 1990, TRAME 2 was fully deployed and TRAME 1 was replaced. The processor of the new hardware was Intel 80286 and the hardware structure and external appearance of the routers was very similar to that of TRAME 1. The software was written in C and the above-mentioned emulator continued to be used. Improvements over TRAME 1 were the introduction of the standardized X.25 access protocol

    Read more →
  • Electronic lab notebook

    Electronic lab notebook

    An electronic lab notebook or electronic laboratory notebook (ELN) is a computer program designed to replace paper laboratory notebooks. Lab notebooks in general are used by scientists, engineers, and technicians to document research, experiments, and procedures performed in a laboratory. A lab notebook is often maintained to be a legal document and may be used in a court of law as evidence. Similar to an inventor's notebook, the lab notebook is also often referred to in patent prosecution and intellectual property litigation. Electronic lab notebooks offer many benefits to the user as well as organizations; they are easier to search upon, simplify data copying and backups, and support collaboration amongst many users. ELNs can have fine-grained access controls, and can be more secure than their paper counterparts. They also allow the direct incorporation of data from instruments, replacing the practice of printing out data to be stapled into a paper notebook. == Types == ELNs can be divided into two categories: "Specific ELNs" contain features designed to work with specific applications, scientific instrumentation or data types. "Cross-disciplinary ELNs" or "Generic ELNs" are designed to support access to all data and information that needs to be recorded in a lab notebook. Lab Platforms that combine an ELN, LIMS, and scientific data management together, all-in-one configurable software environment. Solutions range from specialized programs designed from the ground up for use as an ELN, to modifications or direct use of more general programs. Examples of using more general software as an ELN include using OpenWetWare, a MediaWiki install (running the same software that Wikipedia uses), WordPress, or the use of general note taking software such as OneNote as an ELN. ELN's come in many different forms. They can be standalone programs, use a client-server model, or be entirely web-based. Some use a lab-notebook approach, others resemble a blog. ELNs are embracing artificial intelligence and LLM technology to provide scientific AI chat assistants. A good many variations on the "ELN" acronym have appeared. Differences between systems with different names are often subtle, with considerable functional overlap between them. Examples include "ERN" (Electronic Research Notebook), "ERMS" (Electronic Resource (or Research or Records) Management System (or Software) and SDMS (Scientific Data (or Document) Management System (or Software). Ultimately, these types of systems all strive to do the same thing: Capture, record, centralize and protect scientific data in a way that is highly searchable, historically accurate, and legally stringent, and which also promotes secure collaboration, greater efficiency, reduced mistakes and lowered total research costs. == Objectives == A good electronic laboratory notebook should offer a secure environment to protect the integrity of both data and process, whilst also affording the flexibility to adopt new processes or changes to existing processes without recourse to further software development. The package architecture should be a modular design, so as to offer the benefit of minimizing validation costs of any subsequent changes that you may wish to make in the future as your needs change. A good electronic laboratory notebook should be an "out of the box" solution that, as standard, has fully configurable forms to comply with the requirements of regulated analytical groups through to a sophisticated ELN for inclusion of structures, spectra, chromatograms, pictures, text, etc. where a preconfigured form is less appropriate. All data within the system may be stored in a database (e.g. MySQL, MS-SQL, Oracle) and be fully searchable. The system should enable data to be collected, stored and retrieved through any combination of forms or ELN that best meets the requirements of the user. The application should enable secure forms to be generated that accept laboratory data input via PCs and/or laptops / palmtops, and should be directly linked to electronic devices such as laboratory balances, pH meters, etc. Networked or wireless communications should be accommodated for by the package which will allow data to be interrogated, tabulated, checked, approved, stored and archived to comply with the latest regulatory guidance and legislation. A system should also include a scheduling option for routine procedures such as equipment qualification and study related timelines. It should include configurable qualification requirements to automatically verify that instruments have been cleaned and calibrated within a specified time period, that reagents have been quality-checked and have not expired, and that workers are trained and authorized to use the equipment and perform the procedures. == Regulatory and legal aspects == The laboratory accreditation criteria found in the ISO 17025 standard needs to be considered for the protection and computer backup of electronic records. These criteria can be found specifically in clause 4.13.1.4 of the standard. Electronic lab notebooks used for development or research in regulated industries, such as medical devices or pharmaceuticals, are expected to comply with FDA regulations related to software validation. The purpose of the regulations is to ensure the integrity of the entries in terms of time, authorship, and content. Unlike ELNs for patent protection, FDA is not concerned with patent interference proceedings, but is concerned with avoidance of falsification. Typical provisions related to software validation are included in the medical device regulations at 21 CFR 820 (et seq.) and Title 21 CFR Part 11. Essentially, the requirements are that the software has been designed and implemented to be suitable for its intended purposes. Evidence to show that this is the case is often provided by a Software Requirements Specification (SRS) setting forth the intended uses and the needs that the ELN will meet; one or more testing protocols that, when followed, demonstrate that the ELN meets the requirements of the specification and that the requirements are satisfied under worst-case conditions. Security, audit trails, prevention of unauthorized changes without substantial collusion of otherwise independent personnel (i.e., those having no interest in the content of the ELN such as independent quality unit personnel) and similar tests are fundamental. Finally, one or more reports demonstrating the results of the testing in accordance with the predefined protocols are required prior to release of the ELN software for use. If the reports show that the software failed to satisfy any of the SRS requirements, then corrective and preventive action ("CAPA") must be undertaken and documented. Such CAPA may extend to minor software revisions, or changes in architecture or major revisions. CAPA activities need to be documented as well. Aside from the requirements to follow such steps for regulated industry, such an approach is generally a good practice in terms of development and release of any software to assure its quality and fitness for use. There are standards related to software development and testing that can be applied (see ref.).

    Read more →
  • ACTS Gigabit Satellite Network

    ACTS Gigabit Satellite Network

    The ACTS Gigabit Satellite Network was a pioneering, high-speed communications satellite network in the years 1993-2004, created as a prototype system to explore high-speed networking of digital endpoints. The system was jointly sponsored by NASA and ARPA, implemented by BBN Technologies and Motorola, and was inducted into the Space Technology Hall of Fame in April 1997. The Advanced Communications Technology Satellite (ACTS) network was designed to provide fiber-compatible SONET service to remote nodes and networks through a wideband satellite system, and provided long-haul, point-to-point and point-to-multipoint full-duplex SONET services, at rates up to 622 Mbit/s, over NASA's Advanced Communication Technology Satellite (ACTS). The Advanced Communications Technology Satellite itself, built and operated by Lockheed Martin, was launched on STS-51 on September 12, 1993, by the Space Shuttle Discovery, and occupied a geostationary orbit at 100° west longitude. It was the first communication satellite to operate in the 20–30 GHz frequency band (Ka band), with 30 GHz uplink and 20 GHz downlink signals. The satellite incorporated advanced on-board switching and multiple dynamically-hopping spot-beam antennas for selected areas of the United States including Hawaii. Up to 3 uplink and 3 downlink antenna beams could be active simultaneously. The ACTS network ground terminals were transportable Gigabit Earth Stations (GES) with fiber-optic SONET interfaces (OC-3 and OC-12), which also supported the Asynchronous Transfer Mode (ATM) protocol suite. The network control and management functions are distributed in the various Gigabit Earth Stations, with the operator's interface being centralized in a Network Management Terminal (NMT), which could be collocated at a GES, or anywhere in the Internet. The system was operational and used for experiments for 127 months, instead of the originally planned 24–48 months. In all, 53 terminals were built and used by more than 100 experimenters to test ACTS abilities. In Nov. 1997 a record data rate of 520 Mbit/s TCP/IP throughput was achieved using ATM between several ground stations via ACTS. On May 31, 2000 the ACTS experiments program officially came to a close, but the system continued to support experiments until it was deactivated on April 28, 2004.

    Read more →
  • Single particle analysis

    Single particle analysis

    Single particle analysis is a group of related computerized image processing techniques used to analyze images from transmission electron microscopy (TEM). These methods were developed to improve and extend the information obtainable from TEM images of particulate samples, typically proteins or other large biological entities such as viruses. Individual images of stained or unstained particles are very noisy, making interpretation difficult. Combining several digitized images of similar particles together gives an image with stronger and more easily interpretable features. An extension of this technique uses single particle methods to build up a three-dimensional reconstruction of the particle. Using cryo-electron microscopy it has become possible to generate reconstructions with sub-nanometer, near-atomic resolution resolution first in the case of highly symmetric viruses, and now in smaller, asymmetric proteins as well. == Techniques == Single particle analysis can be done on both negatively stained and vitreous ice-embedded transmission electron cryomicroscopy (CryoTEM) samples. Single particle analysis methods are, in general, reliant on the sample being homogeneous, although techniques for dealing with conformational heterogeneity are being developed. Images (micrographs) are taken with an electron microscope using charged-coupled device (CCD) detectors coupled to a phosphorescent layer (in the past, they were instead collected on film and digitized using high-quality scanners). The image processing is carried out using specialized software programs, often run on multi-processor computer clusters. Depending on the sample or the desired results, various steps of two- or three-dimensional processing can be done. === Alignment and classification === Biological samples, and especially samples embedded in thin vitreous ice, are highly radiation sensitive, thus only low electron doses can be used to image the sample. This low dose, as well as variations in the metal stain used (if used) means images have high noise relative to the signal given by the particle being observed. By aligning several similar images to each other so they are in register and then averaging them, an image with higher signal-to-noise ratio can be obtained. As the noise is mostly randomly distributed and the underlying image features constant, by averaging the intensity of each pixel over several images only the constant features are reinforced. Typically, the optimal alignment (a translation and an in-plane rotation) to map one image onto another is calculated by cross-correlation. However, a micrograph often contains particles in multiple different orientations and/or conformations, and so to get more representative image averages, a method is required to group similar particle images together into multiple sets. This is normally carried out using one of several data analysis and image classification algorithms, such as multi-variate statistical analysis and hierarchical ascendant classification, or k-means clustering. Often data sets of tens of thousands of particle images are used, and to reach an optimal solution an iterative procedure of alignment and classification is used, whereby strong image averages produced by classification are used as reference images for a subsequent alignment of the whole data set. === Image filtering === Image filtering (band-pass filtering) is often used to reduce the influence of high and/or low spatial frequency information in the images, which can affect the results of the alignment and classification procedures. This is particularly useful in negative stain images. The algorithms make use of fast Fourier transforms (FFT), often employing Gaussian shaped soft-edged masks in reciprocal space to suppress certain frequency ranges. High-pass filters remove low spatial frequencies (such as ramp or gradient effects), leaving the higher frequencies intact. Low-pass filters remove high spatial frequency features and have a blurring effect on fine details. === Contrast transfer function === Due to the nature of image formation in the electron microscope, bright-field TEM images are obtained using significant underfocus. This, along with features inherent in the microscope's lens system, creates blurring of the collected images visible as a point spread function. The combined effects of the imaging conditions are known as the contrast transfer function (CTF), and can be approximated mathematically as a function in reciprocal space. Specialized image processing techniques such as phase flipping and amplitude correction / Wiener filtering can (at least partially) correct for the CTF, and allow high resolution reconstructions. === Three-dimensional reconstruction === Transmission electron microscopy images are projections of the object showing the distribution of density through the object, similar to medical X-rays. By making use of the projection-slice theorem a three-dimensional reconstruction of the object can be generated by combining many images (2D projections) of the object taken from a range of viewing angles. Proteins in vitreous ice ideally adopt a random distribution of orientations (or viewing angles), allowing a fairly isotropic reconstruction if a large number of particle images are used. This contrasts with electron tomography, where the viewing angles are limited due to the geometry of the sample/imaging set up, giving an anisotropic reconstruction. Filtered back projection is a commonly used method of generating 3D reconstructions in single particle analysis, although many alternative algorithms exist. Before a reconstruction can be made, the orientation of the object in each image needs to be estimated. Several methods have been developed to work out the relative Euler angles of each image. Some are based on common lines (common 1D projections and sinograms), others use iterative projection matching algorithms. The latter works by beginning with a simple, low resolution 3D starting model and compares the experimental images to projections of the model and creates a new 3D to bootstrap towards a solution. Methods are also available for making 3D reconstructions of helical samples (such as tobacco mosaic virus), taking advantage of the inherent helical symmetry. Both real space methods (treating sections of the helix as single particles) and reciprocal space methods (using diffraction patterns) can be used for these samples. === Tilt methods === The specimen stage of the microscope can be tilted (typically along a single axis), allowing the single particle technique known as random conical tilt. An area of the specimen is imaged at both zero and at high angle (~60-70 degrees) tilts, or in the case of the related method of orthogonal tilt reconstruction, +45 and −45 degrees. Pairs of particles corresponding to the same object at two different tilts (tilt pairs) are selected, and by following the parameters used in subsequent alignment and classification steps a three-dimensional reconstruction can be generated relatively easily. This is because the viewing angle (defined as three Euler angles) of each particle is known from the tilt geometry. 3D reconstructions from random conical tilt suffer from missing information resulting from a restricted range of orientations. Known as the missing cone (due to the shape in reciprocal space), this causes distortions in the 3D maps. However, the missing cone problem can often be overcome by combining several tilt reconstructions. Tilt methods are best suited to negatively stained samples, and can be used for particles that adsorb to the carbon support film in preferred orientations. The phenomenon known as charging or beam-induced movement makes collecting high-tilt images of samples in vitreous ice challenging. === Map visualization and fitting === Various software programs are available that allow viewing the 3D maps. These often enable the user to manually dock in protein coordinates (structures from X-ray crystallography, NMR, or a computational model such as one found in the AlphaFold Protein Structure Database) of subunits into the electron density. Several programs can also fit subunits computationally; as of the 2020s using these programs tend to produce better accuracy than manual docking because they can perform labor-intensive tasks such as: The scale of SPA-derived maps depends on knowing the pixel size (angstorms per pixel), which is not always accurate. Programs can automatically correct for this difference by using coordinate data or by using knowledge of chemical bonds. Many proteins are made up of several roughly rigid protein domains linked by flexible parts. Pre-existing coordinate data, whether experimental or computational, may not exactly match the inter-domain positioning of the cyro-EM map. Modern programs can automatically "chop" pre-existing coordinate data into individual domains and fit them in individually. For higher-resolution structures, it is pos

    Read more →
  • Interplanetary Internet

    Interplanetary Internet

    The interplanetary Internet is a conceived computer network in space, consisting of a set of network nodes that can communicate with each other. These nodes are the planet's orbiters and landers, and the Earth ground stations. For example, the orbiters collect the scientific data from the Curiosity rover on Mars through near-Mars communication links, transmit the data to Earth through direct links from the Mars orbiters to the Earth ground stations via the NASA Deep Space Network, and finally the data routed through Earth's internal internet. Interplanetary communication is greatly delayed by interplanetary distances, as data transmission can only go as fast as the speed of light, so a new set of protocols and technologies that are tolerant to large delays and errors are required. The interplanetary Internet has been envisioned as a store and forward network of internets that is often disconnected, has a wireless backbone fraught with error-prone links and delays ranging from tens of minutes to even hours, even when there is a connection. As of 2024 agencies and companies working towards bringing the network to fruition include NASA, ESA, SpaceX and Blue Origin. == Challenges and reasons == In the core implementation of Interplanetary Internet, satellites orbiting a planet communicate to other planet's satellites. Simultaneously, these planets revolve around the Sun with long distances, and thus many challenges face the communications. The reasons and the resultant challenges are: The motion and long distances between planets: The interplanetary communication is greatly delayed due to the interplanetary distances and the motion of the planets. The delay is variable and long, ranging from a couple of minutes (Earth-to-Mars), to a couple of hours (Pluto-to-Earth), depending on their relative positions. The interplanetary communication also suspends due to the solar conjunction, when the sun's radiation hinders the direct communication between the planets. As such, the communication characterizes lossy links and intermittent link connectivity. Low embeddable payload: Satellites can only carry a small payload, which poses challenges to the power, mass, size, and cost for communication hardware design. An asymmetric bandwidth would be the result of this limitation. This asymmetry reaches ratios up to 1000:1 as downlink:uplink bandwidth portion. Absence of fixed infrastructure: The graph of participating nodes in a specific planet-to-planet communication keeps changing over time, due to the constant motion. The routes of the planet-to-planet communication are planned and scheduled rather than being opportunistic. The Interplanetary Internet design must address these challenges to operate successfully and achieve good communication with other planets. It also must use the few available resources efficiently in the system. == Development == Space communication technology has steadily evolved from expensive, one-of-a-kind point-to-point architectures, to the re-use of technology on successive missions, to the development of standard protocols agreed upon by space agencies of many countries. This last phase has gone on since 1982 through the efforts of the Consultative Committee for Space Data Systems (CCSDS), a body composed of the major space agencies of the world. It has 11 member agencies, 32 observer agencies, and over 119 industrial associates. The evolution of space data system standards has gone on in parallel with the evolution of the Internet, with conceptual cross-pollination where fruitful, but largely as a separate evolution. Since the late 1990s, familiar Internet protocols and CCSDS space link protocols have integrated and converged in several ways; for example, the successful FTP file transfer to Earth-orbiting STRV 1B on January 2, 1996, which ran FTP over the CCSDS IPv4-like Space Communications Protocol Specifications (SCPS) protocols. Internet Protocol use without CCSDS has taken place on spacecraft, e.g., demonstrations on the UoSAT-12 satellite, and operationally on the Disaster Monitoring Constellation. Having reached the era where networking and IP on board spacecraft have been shown to be feasible and reliable, a forward-looking study of the bigger picture was the next phase. The Interplanetary Internet study at NASA's Jet Propulsion Laboratory (JPL) was started by a team of scientists at JPL led by internet pioneer Vinton Cerf and the late Adrian Hooke. Cerf was appointed as a distinguished visiting scientist at JPL in 1998, while Hooke was one of the founders and directors of CCSDS. While IP-like SCPS protocols are feasible for short hops, such as ground station to orbiter, rover to lander, lander to orbiter, probe to flyby, and so on, delay-tolerant networking is needed to get information from one region of the Solar System to another. It becomes apparent that the concept of a region is a natural architectural factoring of the Interplanetary Internet. A region is an area where the characteristics of communication are the same. Region characteristics include communications, security, the maintenance of resources, perhaps ownership, and other factors. The Interplanetary Internet is a "network of regional internets". What is needed then, is a standard way to achieve end-to-end communication through multiple regions in a disconnected, variable-delay environment using a generalized suite of protocols. Examples of regions might include the terrestrial Internet as a region, a region on the surface of the Moon or Mars, or a ground-to-orbit region. The recognition of this requirement led to the concept of a "bundle" as a high-level way to address the generalized Store-and-Forward problem. Bundles are an area of new protocol development in the upper layers of the OSI model, above the Transport Layer with the goal of addressing the issue of bundling store-and-forward information so that it can reliably traverse radically dissimilar environments constituting a "network of regional internets". Delay-tolerant networking (DTN) was designed to enable standardized communications over long distances and through time delays. At its core is the Bundle Protocol (BP), which is similar to the Internet Protocol, or IP, that serves as the heart of the Internet here on Earth. The big difference between the regular Internet Protocol (IP) and the Bundle Protocol is that IP assumes a seamless end-to-end data path, while BP is built to account for errors and disconnections — glitches that commonly plague deep-space communications. Bundle Service Layering, implemented as the Bundling protocol suite for delay-tolerant networking, will provide general-purpose delay-tolerant protocol services in support of a range of applications: custody transfer, segmentation and reassembly, end-to-end reliability, end-to-end security, and end-to-end routing among them. The Bundle Protocol was first tested in space on the UK-DMC satellite in 2008. An example of one of these end-to-end applications flown on a space mission is the CCSDS File Delivery Protocol (CFDP), used on the Deep Impact comet mission. CFDP is an international standard for automatic, reliable file transfer in both directions. CFDP should not be confused with Coherent File Distribution Protocol, which has the same acronym and is an IETF-documented experimental protocol for rapidly deploying files to multiple targets in a highly networked environment. In addition to reliably copying a file from one entity (such as a spacecraft or ground station) to another entity, CFDP has the capability to reliably transmit arbitrarily small messages defined by the user, in the metadata accompanying the file, and to reliably transmit commands relating to file system management that are to be executed automatically on the remote end-point entity (such as a spacecraft) upon successful reception of a file. == Protocol == The Consultative Committee for Space Data Systems (CCSDS) packet telemetry standard defines the protocol used for the transmission of spacecraft instrument data over the deep-space channel. Under this standard, an image or other data sent from a spacecraft instrument is transmitted using one or more packets. === CCSDS packet definition === A packet is a block of data with length that can vary between successive packets, ranging from 7 to 65,542 bytes, including the packet header. Packetized data is transmitted via frames, which are fixed-length data blocks. The size of a frame, including frame header and control information, can range up to 2048 bytes. Packet sizes are fixed during the development phase. Because packet lengths are variable but frame lengths are fixed, packet boundaries usually do not coincide with frame boundaries. === Telecom processing notes === Data in a frame is typically protected from channel errors by error-correcting codes. Even when the channel errors exceed the correction capability of the error-correcting code, the presence of errors is nearly always detected by the e

    Read more →
  • Data analysis

    Data analysis

    Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays an important role in making decisions more scientific and helping businesses operate more effectively. It is widely used in fields such as business analytics, healthcare, and artificial intelligence to extract meaningful insights from data. Data mining is a particular data analysis technique that focuses on statistical modeling and knowledge discovery for predictive rather than purely descriptive purposes, while business intelligence covers data analysis that relies heavily on aggregation, focusing mainly on business information. In statistical applications, data analysis can be divided into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data, while CDA focuses on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a variety of unstructured data. All of the above are varieties of data analysis. == Data analysis process == Data analysis is a process for obtaining raw data, and subsequently converting it into information useful for decision-making by users. Statistician John Tukey, defined data analysis in 1961, as:"Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data." There are several phases, and they are iterative, in that feedback from later phases may result in additional work in earlier phases. === Data requirements === The data is necessary as inputs to the analysis, which is specified based upon the requirements of those directing the analytics (or customers, who will use the finished product of the analysis). The general type of entity upon which the data will be collected is referred to as an experimental unit (e.g., a person or population of people). Specific variables regarding a population (e.g., age and income) may be specified and obtained. Data may be numerical or categorical (i.e., a text label for numbers). === Data collection === Data may be collected from a variety of sources. A list of data sources are available for study & research. The requirements may be communicated by analysts to custodians of the data; such as, Information Technology personnel within an organization. Data collection or data gathering is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes. The data may also be collected from sensors in the environment, including traffic cameras, satellites, recording devices, etc. It may also be obtained through interviews, downloads from online sources, or reading documentation. === Data processing === Data integration is a precursor to data analysis: Data, when initially obtained, must be processed or organized for analysis. For instance, this may involve placing data into rows and columns in a table format (known as structured data) for further analysis, often through the use of spreadsheet (e.g. Excel) or statistical software. === Data cleaning === Once processed and organized, the data may be incomplete, contain duplicates, or contain errors. The need for data cleaning will arise from problems in the way that the data is entered and stored. Data cleaning is the process of preventing and correcting these errors. Common tasks include record matching, identifying inaccuracy of data, overall quality of existing data, deduplication, and column segmentation. Such data problems can also be identified through a variety of analytical techniques. For example; with financial information, the totals for particular variables may be compared against separately published numbers that are believed to be reliable. Unusual amounts, above or below predetermined thresholds, may also be reviewed. There are several types of data cleaning that are dependent upon the type of data in the set; this could be phone numbers, email addresses, employers, or other values. Quantitative data methods for outlier detection can be used to get rid of data that appears to have a higher likelihood of being input incorrectly. Text data spell checkers can be used to lessen the amount of mistyped words. However, it is harder to tell if the words are contextually (i.e., semantically and idiomatically) correct. === Exploratory data analysis === Once the datasets are cleaned, they can then begin to be analyzed using exploratory data analysis. The process of data exploration may result in additional data cleaning or additional requests for data; thus, the initialization of the iterative phases mentioned above. Descriptive statistics, such as the average, median, and standard deviation, are often used to broadly characterize the data. Data visualization is also used, in which the analyst is able to examine the data in a graphical format in order to obtain additional insights about messages within the data. === Modeling and algorithms === Mathematical formulas or mathematical models (supported by algorithms) may be applied to the data in order to identify relationships among the variables; for example, checking for correlation and by determining whether or not there is the presence of causality. In general terms, models may be developed to evaluate a specific variable based on other variable(s) contained within the dataset, with some residual error depending on the implemented model's accuracy (e.g., Data = Model + Error). Inferential statistics utilizes techniques that measure the relationships between particular variables. For example, regression analysis may be used to model whether a change in advertising (independent variable X), provides an explanation for the variation in sales (dependent variable Y), i.e. is Y a function of X? This can be described as (Y = aX + b + error), where the model is designed such that (a) and (b) minimize the error when the model predicts Y for a given range of values of X. === Data product === A data product is a computer application that takes data inputs and generates outputs, feeding them back into the environment. It may be based on a model or algorithm. For instance, an application that analyzes data about customer purchase history, and uses the results to recommend other purchases the customer might enjoy. === Communication === Once data is analyzed, it may be presented in many formats to the users of the analysis to support their requirements. The users may have feedback, which results in additional analysis. When determining how to communicate the results, the analyst may consider implementing a variety of data visualization techniques to help communicate the message more clearly and efficiently to the audience. Data visualization uses information displays (graphics such as, tables and charts) to help communicate key messages contained in the data. Tables are a valuable tool by enabling the ability of a user to query and focus on specific numbers; while charts (e.g., bar charts or line charts), may help explain the quantitative messages contained in the data. == Quantitative messages == Stephen Few described eight types of quantitative messages that users may attempt to communicate from a set of data, including the associated graphs. Time-series: A single variable is captured over a period of time, such as the unemployment rate over a 10-year period. A line chart may be used to demonstrate the trend. Ranking: Categorical subdivisions are ranked in ascending or descending order, such as a ranking of sales performance (the measure) by salespersons (the category, with each salesperson a categorical subdivision) during a single period. A bar chart may be used to show the comparison across the salespersons. Part-to-whole: Categorical subdivisions are measured as a ratio to the whole (i.e., a percentage out of 100%). A pie chart or bar chart can show the comparison of ratios, such as the market share represented by competitors in a market. Deviation: Categorical subdivisions are compared against a reference, such as a comparison of actual vs. budget expenses for several departments of a business for a given time period. A bar chart can show the comparison of the actual versus the reference amount. Frequency distribution:

    Read more →
  • Data proliferation

    Data proliferation

    Data proliferation refers to the prodigious amount of data, structured and unstructured, that businesses and governments continue to generate at an unprecedented rate and the usability problems that result from attempting to store and manage that data. While originally pertaining to problems associated with paper documentation, data proliferation has become a major problem in primary and secondary data storage on computers. While digital storage has become cheaper, the associated costs, from raw power to maintenance and from metadata to search engines, have not kept up with the proliferation of data. Although the power required to maintain a unit of data has fallen, the cost of facilities which house the digital storage has tended to rise. Data proliferation has been documented as a problem for the U.S. military since August 1971, in particular regarding the excessive documentation submitted during the acquisition of major weapon systems. Efforts to mitigate data proliferation and the problems associated with it are ongoing. == Problems caused == The problem of data proliferation is affecting all areas of commerce as a result of the availability of relatively inexpensive data storage devices. This has made it very easy to dump data into secondary storage immediately after its window of usability has passed. This masks problem that could gravely affect the profitability of businesses and the efficient functioning of health services, police and security forces, local and national governments, and many other types of organizations. Data proliferation is problematic for several reasons: Difficulty when trying to find and retrieve information. At Xerox, on average it takes employees more than one hour per week to find hard-copy documents, costing $2,152 a year to manage and store them. For businesses with more than 10 employees, this increases to almost two hours per week at $5,760 per year. In large networks of primary and secondary data storage, problems finding electronic data are analogous to problems finding hard copy data. Data loss and legal liability when data is disorganized, not properly replicated, or cannot be found promptly. In April 2005, the Ameritrade Holding Corporation told 200,000 current and past customers that a tape containing confidential information had been lost or destroyed in transit. In May of the same year, Time Warner Incorporated reported that 40 tapes containing personal data on 600,000 current and former employees had been lost en route to a storage facility. In March 2005, a Florida judge hearing a $2.7 billion lawsuit against Morgan Stanley issued an "adverse inference order" against the company for "willful and gross abuse of its discovery obligations." The judge cited Morgan Stanley for repeatedly finding misplaced tapes of e-mail messages long after the company had claimed that it had turned over all such tapes to the court. Increased manpower requirements to manage increasingly chaotic data storage resources. Slower networks and application performance due to excess traffic as users search and search again for the material they need. High cost in terms of the energy resources required to operate storage hardware. A 100 terabyte system will cost up to $35,040 a year to run—not counting cooling costs. == Proposed solutions == Applications that better utilize modern technology Reductions in duplicate data (especially as caused by data movement) Improvement of metadata structures Improvement of file and storage transfer structures User education and discipline The implementation of Information Lifecycle Management solutions to eliminate low-value information as early as possible before putting the rest into actively managed long-term storage in which it can be quickly and cheaply accessed.

    Read more →
  • Alexis Spectral Data

    Alexis Spectral Data

    Alexis Spectral Data is a software developed for colour matching processes that calculates from available spectral data the colour numbers used by computers to display colours on screen. It displays the colour for each spectral reflectance curve and records the calculated trichromatic values and colour numbers along with the spectral curves. This eliminates the need to scan the samples separately with a truecolour Scanner while creating the database. The spectral data can be introduced manually as a series of reflectance values at wavelengths measured in different standard illuminants with an arbitrary but fixed increment that must be kept for each spectral curve throughout the creation of the whole database. Therefore, older UV-VIS Spectrophotometers that can't be interfaced with computers can also be used for creating the database needed for colour matching. Alexis Spectral Data determines the whiteness degree in a less time-consuming method, which permits storage and easier handling of the obtained data. Alexis Spectral Data can export the trichromatic values, calculated from the spectral curves, to Alexis Analyser, software that handles only trichromatic data. The earliest information about the development of this software comes from a paper published by a student at the University Politehnica Bucharest in 1993. The software runs on Windows based computers but not on other operating systems.

    Read more →
  • Data transformation (computing)

    Data transformation (computing)

    In computing, data transformation is the process of converting data from one format or structure into another format or structure. It is a fundamental aspect of most data integration and data management tasks such as data wrangling, data warehousing, data integration and application integration. Data transformation can be simple or complex based on the required changes to the data between the source (initial) data and the target (final) data. Data transformation is typically performed via a mixture of manual and automated steps. Tools and technologies used for data transformation can vary widely based on the format, structure, complexity, and volume of the data being transformed. A master data recast is another form of data transformation where the entire database of data values is transformed or recast without extracting the data from the database. All data in a well-designed database is directly or indirectly related to a limited set of master database tables by a network of foreign key constraints. Each foreign key constraint is dependent upon a unique database index from the parent database table. Therefore, when the proper master database table is recast with a different unique index, the directly and indirectly related data are also recast or restated. The directly and indirectly related data may also still be viewed in the original form since the original unique index still exists with the master data. Also, the database recast must be done in such a way as to not impact the applications architecture software. When the data mapping is indirect via a mediating data model, the process is also called data mediation. == Data transformation process == Data transformation can be divided into the following steps, each applicable as needed based on the complexity of the transformation required. Data discovery Data mapping Code generation Code execution Data review These steps are often the focus of developers or technical data analysts who may use multiple specialized tools to perform their tasks. The steps can be described as follows: Data discovery is the first step in the data transformation process. Typically the data is profiled using profiling tools or sometimes using manually written profiling scripts to better understand the structure and characteristics of the data and decide how it needs to be transformed. Data mapping is the process of defining how individual fields are mapped, modified, joined, filtered, aggregated etc. to produce the final desired output. Developers or technical data analysts traditionally perform data mapping since they work in the specific technologies to define the transformation rules (e.g. visual ETL tools, transformation languages). Code generation is the process of generating executable code (e.g. SQL, Python, R, or other executable instructions) that will transform the data based on the desired and defined data mapping rules. Typically, the data transformation technologies generate this code based on the definitions or metadata defined by the developers. Code execution is the step whereby the generated code is executed against the data to create the desired output. The executed code may be tightly integrated into the transformation tool, or it may require separate steps by the developer to manually execute the generated code. Data review is the final step in the process, which focuses on ensuring the output data meets the transformation requirements. It is typically the business user or final end-user of the data that performs this step. Any anomalies or errors in the data that are found and communicated back to the developer or data analyst as new requirements to be implemented in the transformation process. == Types of data transformation == === Batch data transformation === Traditionally, data transformation has been a bulk or batch process, whereby developers write code or implement transformation rules in a data integration tool, and then execute that code or those rules on large volumes of data. This process can follow the linear set of steps as described in the data transformation process above. Batch data transformation is the cornerstone of virtually all data integration technologies such as data warehousing, data migration and application integration. When data must be transformed and delivered with low latency, the term "microbatch" is often used. This refers to small batches of data (e.g. a small number of rows or a small set of data objects) that can be processed very quickly and delivered to the target system when needed. === Benefits of batch data transformation === Traditional data transformation processes have served companies well for decades. The various tools and technologies (data profiling, data visualization, data cleansing, data integration etc.) have matured and most (if not all) enterprises transform enormous volumes of data that feed internal and external applications, data warehouses and other data stores. === Limitations of traditional data transformation === This traditional process also has limitations that hamper its overall efficiency and effectiveness. The people who need to use the data (e.g. business users) do not play a direct role in the data transformation process. Typically, users hand over the data transformation task to developers who have the necessary coding or technical skills to define the transformations and execute them on the data. This process leaves the bulk of the work of defining the required transformations to the developer, which often in turn do not have the same domain knowledge as the business user. The developer interprets the business user requirements and implements the related code/logic. This has the potential of introducing errors into the process (through misinterpreted requirements), and also increases the time to arrive at a solution. This problem has given rise to the need for agility and self-service in data integration (i.e. empowering the user of the data and enabling them to transform the data themselves interactively). There are companies that provide self-service data transformation tools. They are aiming to efficiently analyze, map and transform large volumes of data without the technical knowledge and process complexity that currently exists. While these companies use traditional batch transformation, their tools enable more interactivity for users through visual platforms and easily repeated scripts. Still, there might be some compatibility issues (e.g. new data sources like IoT may not work correctly with older tools) and compliance limitations due to the difference in data governance, preparation and audit practices. === Interactive data transformation === Interactive data transformation (IDT) is an emerging capability that allows business analysts and business users the ability to directly interact with large datasets through a visual interface, understand the characteristics of the data (via automated data profiling or visualization), and change or correct the data through simple interactions such as clicking or selecting certain elements of the data. Although interactive data transformation follows the same data integration process steps as batch data integration, the key difference is that the steps are not necessarily followed in a linear fashion and typically don't require significant technical skills for completion. There are a number of companies that provide interactive data transformation tools, including Trifacta, Alteryx and Paxata. They are aiming to efficiently analyze, map and transform large volumes of data while at the same time abstracting away some of the technical complexity and processes which take place under the hood. Interactive data transformation solutions provide an integrated visual interface that combines the previously disparate steps of data analysis, data mapping and code generation/execution and data inspection. That is, if changes are made at one step (like for example renaming), the software automatically updates the preceding or following steps accordingly. Interfaces for interactive data transformation incorporate visualizations to show the user patterns and anomalies in the data so they can identify erroneous or outlying values. Once they've finished transforming the data, the system can generate executable code/logic, which can be executed or applied to subsequent similar data sets. By removing the developer from the process, interactive data transformation systems shorten the time needed to prepare and transform the data, eliminate costly errors in the interpretation of user requirements and empower business users and analysts to control their data and interact with it as needed. == Transformational languages == There are numerous languages available for performing data transformation. Many transformation languages require a grammar to be provided. In many cases, the grammar is structured using something closely resembling Backus–Naur form (BNF). There are numerous languages

    Read more →
  • ISO 15765-2

    ISO 15765-2

    ISO 15765-2, or ISO-TP (Transport Layer), is an international standard for sending data packets over a CAN bus. The protocol allows for the transport of messages that exceed the eight byte maximum payload of CAN frames. ISO-TP segments longer messages into multiple frames, adding metadata (CAN-TP Header) that allows the interpretation of individual frames and reassembly into a complete message packet by the recipient. It can carry up to 232-1 (4294967295) bytes of payload per message packet starting from the 2016 version. Prior versions were limited to a maximum payload size of 4095 bytes. In the OSI model, ISO-TP covers the layer 3 (network layer) and 4 (transport layer). The most common application for ISO-TP is the transfer of diagnostic messages with OBD-II equipped vehicles using KWP2000 and UDS, but is used broadly in other application-specific CAN implementations where one might need to send messages longer than what the CAN protocol physical layer allows (eight bytes for CAN, 64 bytes for CAN FD, and 2048 bytes for CAN-XL). ISO-TP can be operated with its own addressing as so-called Extended Addressing or without address using only the CAN ID (so-called Normal Addressing). Extended addressing uses the first data byte of each frame as an additional element of the address, reducing the application payload by one byte. For clarity the protocol description below is based on Normal Addressing with eight byte CAN frames. In total, six types of addressing are allowed by the ISO 15765-2 Protocol. ISO-TP prepends one or more metadata bytes to the payload data in the eight byte CAN frame, reducing the payload to seven or fewer bytes per frame. The metadata is called the Protocol Control Information, or PCI. The PCI is one, two or three bytes. The initial field is four bits indicating the frame type, and implicitly describing the PCI length. ISO 15765-2 is a part of ISO 15765 (headlined Road vehicles — Diagnostic communication over Controller Area Network (DoCAN)), which has the following parts: ISO 15765-1 Part 1: General information and use case definition ISO 15765-2 Part 2: Transport protocol and network layer services ISO 15765-3 Part 3: Implementation of unified diagnostic services (UDS on CAN) – replaced by ISO 14229-3 Road vehicles — Unified diagnostic services ISO 15765-4 Part 4: Requirements for emissions-related systems == List of protocol control information (PCI) field types == The ISO-TP defines four frame types: A message of seven bytes or less is sent in a single frame, with the initial byte containing the type (0) and payload length (1-7 bytes). With the 0 in the type field, this can also pass as a simpler protocol with a length-data format and is often misinterpreted as such. A message longer than 7 bytes requires segmenting the message packet over multiple frames. A segmented transfer starts with a First Frame. The PCI is two bytes in this case, with the first 4 bit field the type (type 1) and the following 12 bits the message length (excluding the type and length bytes). The recipient confirms the transfer with a flow control frame. The flow control frame has three PCI bytes specifying the interval between subsequent frames and how many consecutive frames may be sent (Block Size). For CAN FD, the ISO 15765-2 protocol has been extended for Single and First frame, to allow larger size values, but still backwards compatible with traditional ISO 15765. See CAN FD. The initial byte contains the type (type = 3) in the first four bits, and a flag in the next four bits indicating if the transfer is allowed (0 = Continue To Send, 1 = Wait, 2 = Overflow/abort). The next byte is the block size, the count of frames that may be sent before waiting for the next flow control frame. A value of zero allows the remaining frames to be sent without flow control or delay. The third byte is the minimum Separation Time (STmin), the minimum delay time between frames. STmin values up to 127 (0x7F) specify the minimum number of milliseconds to delay between frames, while values in the range 241 (0xF1) to 249 (0xF9) specify delays increasing from 100 to 900 microseconds. Note that the Separation Time is defined as the minimum time between the end of one frame to the beginning of the next. Robust implementations should be prepared to accept frames from a sender that misinterprets this as the frame repetition rate i.e. from start-of-frame to start-of-frame. Even careful implementations may fail to account for the minor effect of bit-stuffing in the physical layer. The sender transmits the rest of the message using Consecutive Frames. Each Consecutive Frame has a one byte PCI, with a four bit type (type = 2) followed by a 4-bit sequence number. The sequence number starts at 1 and increments with each frame sent (1, 2,..., F, 0, 1,...), with which lost or discarded frames can be detected. Each consecutive frame starts at 0, initially for the first set of data in the first frame will be considered as 0th data. So the first set of CF(Consecutive frames) start from 0x1. There afterwards when it reaches 0x2F, will be started from 0x20 (e.g. 0x21, 0x22, 0x23...0x2F, 0x20, 0x21...). The 12-bit length field (as indicated in the First Frame) allows up to 4095 bytes of user data in a segmented message, but in practice the typical application-specific limit is considerably lower because of receive buffer or hardware limitations. == Timing parameters == Timing parameters, such as P1 and P2 timers, have to be mentioned. == Standards == ISO 15765-2:2016 Road vehicles -- Diagnostic communication over Controller Area Network (DoCAN) -- Part 2: Transport protocol and network layer services

    Read more →
  • Intent-based network

    Intent-based network

    Intent-Based Networking (IBN) is an approach to network management that shifts the focus from manually configuring individual devices to specifying desired outcomes or business objectives, referred to as "intents". == Description == Rather than relying on low-level commands to configure the network, administrators define these high-level intents, and the network dynamically adjusts itself to meet these requirements. IBN simplifies the management of complex networks by ensuring that the network infrastructure aligns with the desired operational goals. For example, an implementer can explicitly state a network purpose with a policy such as "Allow hosts A and B to communicate with X bandwidth capacity" without the need to understand the detailed mechanisms of the underlying devices (e.g. switches), topology or routing configurations. == Architecture == Advances in Natural Language Understanding (NLU) systems, along with neural network-based algorithms like BERT, RoBERTa, GLUE, and ERNIE, have enabled the conversion of user queries into structured representations that can be processed by automated services. This capability is crucial for managing the increasing complexity of network services. Intent-Based Networking (IBN) leverages these advancements to simplify network management by abstracting network services, reducing operational complexity, and lowering costs. A proposed three-layered architecture integrates intent-based automation into network management systems. In the business layer, intents are based on Key Performance Indicators (KPIs) and Service Level Agreements (SLAs), reflecting business objectives. The intent layer evaluates and re-plans actions dynamically, where a Knowledge module abstracts and reasons about intents, while an Agent interfaces with network objects to execute actions. The data layer observes network objects, updates topology information, and interacts with the Knowledge and Agent modules to ensure accurate and timely responses to network changes. At the bottom, the network layer contains the physical infrastructure, transforming network data into a usable format for the intent layer to act upon.

    Read more →
  • Region Based Convolutional Neural Networks

    Region Based Convolutional Neural Networks

    Region-based Convolutional Neural Networks (R-CNN) are a family of machine learning models for computer vision, and specifically object detection and localization. The original goal of R-CNN was to take an input image and produce a set of bounding boxes as output, where each bounding box contains an object and also the category (e.g. car or pedestrian) of the object. In general, R-CNN architectures perform selective search over feature maps outputted by a CNN. R-CNN has been extended to perform other computer vision tasks, such as: tracking objects from a drone-mounted camera, locating text in an image, and enabling object detection in Google Lens. Mask R-CNN is also one of seven tasks in the MLPerf Training Benchmark, which is a competition to speed up the training of neural networks. == History == The following covers some of the versions of R-CNN that have been developed. November 2013: R-CNN. April 2015: Fast R-CNN. June 2015: Faster R-CNN. March 2017: Mask R-CNN. December 2017: Cascade R-CNN is trained with increasing Intersection over Union (IoU, also known as the Jaccard index) thresholds, making each stage more selective against nearby false positives. June 2019: Mesh R-CNN adds the ability to generate a 3D mesh from a 2D image. == Architecture == For review articles see. === Selective search === Given an image (or an image-like feature map), selective search (also called Hierarchical Grouping) first segments the image by the algorithm in (Felzenszwalb and Huttenlocher, 2004), then performs the following: Input: (colour) image Output: Set of object location hypotheses L Segment image into initial regions R = {r1, ..., rn} using Felzenszwalb and Huttenlocher (2004) Initialise similarity set S = ∅ foreach Neighbouring region pair (ri, rj) do Calculate similarity s(ri, rj) S = S ∪ s(ri, rj) while S ≠ ∅ do Get highest similarity s(ri, rj) = max(S) Merge corresponding regions rt = ri ∪ rj Remove similarities regarding ri: S = S \ s(ri, r∗) Remove similarities regarding rj: S = S \ s(r∗, rj) Calculate similarity set St between rt and its neighbours S = S ∪ St R = R ∪ rt Extract object location boxes L from all regions in R === R-CNN === With R-CNN, prediction follows a two-step process. A preprocessing selective search step generates a large set of candidate objects (typically as many as 2000), known as regions of interest (ROI). These are forwarded to a CNN, which predicts an object class score and bounding box estimate, independently for each ROI. Importantly, the ROIs are heavily filtered to remove excess candidates. This is achieved using two mechanism. Filtering begins by removing ROIs assigned to the background category. This is a specialized category, which is scored by the CNN alongside other categories. An unfortunate reality is that remaining ROIs typically suffer from heavy duplication. Namely, multiple ROIs that cover same objects in the image are all assigned non-background categories. This is resolved by a heuristic non-maximum suppression (NMS) step. === Fast R-CNN === While the original R-CNN independently computed the neural network features on each of as many as two thousand regions of interest, Fast R-CNN runs the neural network once on the whole image. At the end of the network is a ROIPooling module, which slices out each ROI from the network's output tensor, reshapes it, and classifies it. As in the original R-CNN, the Fast R-CNN uses selective search to generate its region proposals. === Faster R-CNN === While Fast R-CNN used selective search to generate ROIs, Faster R-CNN integrates the ROI generation into the neural network itself. === Mask R-CNN === While previous versions of R-CNN focused on object detections, Mask R-CNN adds instance segmentation. Mask R-CNN also replaced ROIPooling with a new method called ROIAlign, which can represent fractions of a pixel.

    Read more →
  • Kaeli McEwen

    Kaeli McEwen

    Kaeli Mae McEwen (born May 10, 2000), known professionally as Kaeli Mae, is an American content creator and social media influencer from Seattle, Washington, known for her TikTok videos about cleaning and organizing and contributing to the "Clean Girl" Internet aesthetic. She has Type 1 diabetes. Her fame was attributed to an increase in use of the name Kaeli for newborn girls in the United States in 2023.

    Read more →
  • HTTP Strict Transport Security

    HTTP Strict Transport Security

    HTTP Strict Transport Security (HSTS) is a policy mechanism that helps to protect websites against man-in-the-middle attacks such as protocol downgrade attacks and cookie hijacking. It allows web servers to declare that web browsers (or other complying user agents) should automatically interact with it using only HTTPS connections, which provide Transport Layer Security (TLS/SSL), unlike the insecure HTTP used alone. HSTS is an IETF standards track protocol and is specified in RFC 6797. The HSTS Policy is communicated by the server to the user agent via an HTTP response header field named Strict-Transport-Security. HSTS Policy specifies a period of time during which the user agent should only access the server in a secure fashion. Websites using HSTS often do not accept clear text HTTP, either by rejecting connections over HTTP or systematically redirecting users to HTTPS (though this is not required by the specification). The consequence of this is that a user-agent not capable of doing TLS will not be able to connect to the site. The protection normally only applies after a user has visited the site at least once, relying on the principle of "trust on first use". The way this protection works is that when a user entering or selecting an HTTP (not HTTPS) URL to the site, the client, such as a Web browser, will automatically upgrade to HTTPS without making an HTTP request, thereby preventing any HTTP man-in-the-middle attack from occurring. To counteract this problem, an HSTS preload list maintained by Google Chrome and used by other major web browsers is maintained. If a domain is on this list, the browser skips the initial request and encrypts all communication immediately. Additional domains can be registered at no cost. == Specification history == The HSTS specification was published as RFC 6797 on 19 November 2012 after being approved on 2 October 2012 by the IESG for publication as a Proposed Standard RFC. The authors originally submitted it as an Internet Draft on 17 June 2010. With the conversion to an Internet Draft, the specification name was altered from "Strict Transport Security" (STS) to "HTTP Strict Transport Security", because the specification applies only to HTTP. The HTTP response header field defined in the HSTS specification however remains named "Strict-Transport-Security". The last so-called "community version" of the then-named "STS" specification was published on 18 December 2009, with revisions based on community feedback. The original draft specification by Jeff Hodges from PayPal, Collin Jackson, and Adam Barth was published on 18 September 2009. The HSTS specification is based on original work by Jackson and Barth as described in their paper "ForceHTTPS: Protecting High-Security Web Sites from Network Attacks". Additionally, HSTS is the realization of one facet of an overall vision for improving web security, put forward by Jeff Hodges and Andy Steingruebl in their 2010 paper The Need for Coherent Web Security Policy Framework(s). == HSTS mechanism overview == A server implements an HSTS policy by supplying a header over an HTTPS connection (HSTS headers over HTTP are ignored). For example, a server could send a header such that future requests to the domain for the next year (max-age is specified in seconds; 31,536,000 is equal to one non-leap year) use only HTTPS: Strict-Transport-Security: max-age=31536000. When a web application issues HSTS Policy to user agents, conformant user agents behave as follows: Automatically turn any insecure links referencing the web application into secure links (e.g. http://example.com/some/page/ will be modified to https://example.com/some/page/ before accessing the server). If the security of the connection cannot be ensured (e.g. the server's TLS certificate is not trusted), the user agent must terminate the connection and should not allow the user to access the web application. This helps protect web application users against some passive (eavesdropping) and active network attacks. A man-in-the-middle attacker has a greatly reduced ability to intercept requests and responses between a user and a web application server while the user's browser has HSTS Policy in effect for that web application. == Applicability == The most important security vulnerability that HSTS can fix is SSL-stripping man-in-the-middle attacks, first publicly introduced by Moxie Marlinspike in his 2009 BlackHat Federal talk "New Tricks For Defeating SSL In Practice". The SSL (and TLS) stripping attack works by transparently converting a secure HTTPS connection into a plain HTTP connection. The user can see that the connection is insecure, but crucially there is no way of knowing whether the connection should be secure. At the time of Marlinspike's talk, many websites did not use TLS/SSL, therefore there was no way of knowing (without prior knowledge) whether the use of plain HTTP was due to an attack, or simply because the website had not implemented TLS/SSL. Additionally, no warnings are presented to the user during the downgrade process, making the attack fairly subtle to all but the most vigilant. Marlinspike's sslstrip tool, presented at Black Hat DC 2009, fully automates the attack. HSTS addresses this problem by informing the browser that connections to the site should always use TLS/SSL. The HSTS header can be stripped by the attacker if this is the user's first visit. Google Chrome, Mozilla Firefox, Internet Explorer, and Microsoft Edge attempt to limit this problem by including a "pre-loaded" list of HSTS sites. Unfortunately this solution cannot scale to include all websites on the internet. See limitations, below. HSTS can also help to prevent having one's cookie-based website login credentials stolen by widely available tools such as Firesheep. Because HSTS is time limited, it is sensitive to attacks involving shifting the victim's computer time e.g. using false NTP packets. == Limitations == The initial request remains unprotected from active attacks if it uses an insecure protocol such as plain HTTP or if the URI for the initial request was obtained over an insecure channel. The same applies to the first request after the activity period specified in the advertised HSTS Policy max-age (sites should set a period of several days or months depending on user activity and behavior). === Solutions with preload list === Google Chrome, Mozilla Firefox, and Internet Explorer/Microsoft Edge address this limitation by implementing a "HSTS preloaded list", which is a list that contains known sites supporting HSTS. This list is distributed with the browser so that it uses HTTPS for the initial request to the listed sites as well. As previously mentioned, these pre-loaded lists cannot scale to cover the entire Web. A potential solution might be achieved by using DNS records to declare HSTS Policy, and accessing them securely via DNSSEC, optionally with certificate fingerprints to ensure validity (which requires running a validating resolver to avoid last mile issues). Junade Ali has noted that HSTS is ineffective against the use of false domains; by using DNS-based attacks, it is possible for a man-in-the-middle interceptor to serve traffic from an artificial domain which is not on the HSTS Preload list, this can be made possible by DNS Spoofing Attacks, or simply a domain name that misleadingly resembles the real domain name such as www.example.org instead of www.example.com. Even with an HSTS preloaded list, HSTS cannot prevent advanced attacks against TLS itself, such as the BEAST or CRIME attacks introduced by Juliano Rizzo and Thai Duong. Attacks against TLS itself are orthogonal to HSTS policy enforcement. Neither can it protect against attacks on the server - if someone compromises it, it will happily serve any content over TLS. === Privacy issues === HSTS can be used to near-indelibly tag visiting browsers with recoverable identifying data (supercookies) which can persist in and out of browser "incognito" privacy modes. By creating a web page that makes multiple HTTP requests to selected domains, for example, if twenty browser requests to twenty different domains are used, theoretically over one million visitors can be distinguished (220) due to the resulting requests arriving via HTTP vs. HTTPS; the latter being the previously recorded binary "bits" established earlier via HSTS headers. == Browser support == Chromium and Google Chrome since version 4.0.211.0 Firefox since version 4; with Firefox 17, Mozilla integrates a list of websites supporting HSTS. Opera since version 12 Safari since OS X Mavericks (version 10.9, late 2013) Internet Explorer 11 on Windows 8.1 and Windows 7 with KB3058515 installed (Released as a Windows Update in June 2015) Microsoft Edge and Internet Explorer 11 on Windows 10 BlackBerry 10 Browser and WebView since BlackBerry OS 10.3.3. == Deployment best practices == Depending on the actual deployment there are certain threats (e.g. cookie injection attacks) t

    Read more →