Secure coding is the practice of developing computer software in such a way that guards against the accidental introduction of security vulnerabilities. Defects, bugs and logic flaws are consistently the primary cause of commonly exploited software vulnerabilities. Through the analysis of thousands of reported vulnerabilities, security professionals have discovered that most vulnerabilities stem from a relatively small number of common software programming errors. By identifying the insecure coding practices that lead to these errors and educating developers on secure alternatives, organizations can take proactive steps to help significantly reduce or eliminate vulnerabilities in software before deployment. Some scholars have suggested that in order to effectively confront threats related to cybersecurity, proper security should be coded or "baked in" to the systems. With security being designed into the software, this ensures that there will be protection against insider attacks and reduces the threat to application security. Implementing secure coding practices is part of the secure by design approach to security engineering. == Buffer-overflow prevention == Buffer overflows, a common software security vulnerability, happen when a process tries to store data beyond a fixed-length buffer. For example, if there are 8 slots to store items in, there will be a problem if there is an attempt to store 9 items. In computer memory the overflowed data may overwrite data in the next location which can result in a security vulnerability (stack smashing) or program termination (segmentation fault). An example of a C program prone to a buffer overflow is If the user input is larger than the destination buffer, a buffer overflow will occur. To fix this unsafe program, use strncpy to prevent a possible buffer overflow. Another secure alternative is to dynamically allocate memory on the heap using malloc. In the above code snippet, the program attempts to copy the contents of src into dst, while also checking the return value of malloc() to ensure that enough memory was able to be allocated for the destination buffer. == Format-string attack prevention == A Format String Attack is when a malicious user supplies specific inputs that will eventually be entered as an argument to a function that performs formatting, such as printf(). The attack involves the adversary reading from or writing to the stack. The C printf function writes output to stdout. If the parameter of the printf function is not properly formatted, several security bugs can be introduced. Below is a program that is vulnerable to a format string attack. A malicious argument passed to the program could be "%s%s%s%s%s%s%s", which can crash the program from improper memory reads. == Integer-overflow prevention == Integer overflow occurs when an arithmetic operation results in an integer too large to be represented within the available space. A program which does not properly check for integer overflow introduces potential software bugs and exploits. Below is a function in C++ which attempts to confirm that the sum of x and y is less than or equal to a defined value MAX: The problem with the code is it does not check for integer overflow on the addition operation. If the sum of x and y is greater than the maximum possible value of an unsigned int, the addition operation will overflow and perhaps result in a value less than or equal to MAX, even though the sum of x and y is greater than MAX. Below is a function which checks for overflow by confirming the sum is greater than or equal to both x and y. If the sum did overflow, the sum would be less than x or less than y. == Path traversal prevention == Path traversal is a vulnerability whereby paths provided from an untrusted source are interpreted in such a way that unauthorised file access is possible. For example, consider a script that fetches an article by taking a filename, which is then read by the script and parsed. Such a script might use the following hypothetical URL to retrieve an article about dog food: https://www.example.net/cgi-bin/article.sh?name=dogfood.html If the script has no input checking, instead trusting that the filename is always valid, a malicious user could forge a URL to retrieve configuration files from the web server: https://www.example.net/cgi-bin/article.sh?name=../../../../../etc/passwd Depending on the script, this may expose the /etc/passwd file, which on Unix-like systems contains (among others) user IDs, their login names, home directory paths and shells. (See SQL injection for a similar attack.) == Regulatory drivers == Secure coding practices are increasingly mandated by regulatory frameworks governing the development and maintenance of software systems that process sensitive data. The Health Insurance Portability and Accountability Act (HIPAA) Security Rule requires covered entities to protect the integrity of protected health information through technical safeguards under 45 CFR 164.312(c)(1) and to implement mechanisms to authenticate electronic protected health information under 45 CFR 164.312(c)(2). The Payment Card Industry Data Security Standard (PCI DSS) version 4.0 Requirement 6.2 mandates that custom software is developed securely, including training developers in secure coding techniques (6.2.2), reviewing custom code for vulnerabilities before release (6.2.3), and addressing common software attacks in development practices (6.2.4).
Association for Computational Linguistics
The Association for Computational Linguistics (ACL) is a scientific and professional organization for people working on natural language processing. Its namesake conference is one of the primary high impact conferences for natural language processing research, along with EMNLP. The conference is held each summer in locations where significant computational linguistics research is carried out. It was founded in 1962, originally named the Association for Machine Translation and Computational Linguistics (AMTCL). It became the ACL in 1968. The ACL has a European (EACL), a North American (NAACL), and an Asian (AACL) chapter. == History == The ACL was founded in 1962 as the Association for Machine Translation and Computational Linguistics (AMTCL). The initial membership was about 100. In 1965, the AMTCL took over the journal Mechanical Translation and Computational Linguistics. This journal was succeeded by many other journals: the American Journal of Computational Linguistics (1974–1978, 1980–1983), and then Computational Linguistics (1984–present). Since 1988, the journal has been published for the ACL by MIT Press. The annual meeting was first held in 1963 in conjunction with the Association for Computing Machinery National Conference. The annual meeting was, for a long time, relatively informal and did not publish anything longer than abstracts. By 1968, the society took on its current name, the Association for Computational Linguistics (ACL). The publication of the annual meeting's Proceedings of the ACL began in 1979 and gradually matured into its modern form. Many of the meetings were held in conjunction with the Linguistic Society of America, and a few with the American Society for Information Science and the Cognitive Science Society. The United States government sponsored much research from 1989 to 1994, characterized by an increase in author retention rates and an increase in research in some key topics, such as speech recognition, in ACL. By the 21st century, it was able to maintain authors at a high rate who coalesced in a more stable arrangement around individual research topics. In 1991, the group published a prototype for a text generator based on the universal grammar theory of Noam Chomsky. The system, nicknamed Parrot, relied on a finite set of syntactic transformations and a hand-curated lexicon. Despite some initial success, including experimentation with morpheme syntactics, funding halted after the research team encountered intractable difficulties with inflection and abstract locutions. == Annual Meeting of the ACL == Every year, the ACL holds the Annual Meeting of the ACL. The location lies in Europe in years zero modulo three, North America in years one modulo three, and Asia–Australia in years two modulo three. In 2020, the Annual Meeting received for the first time more submissions from China than the United States. == Activities == The ACL organizes several of the top conferences and workshops in the field of computational linguistics and natural language processing. These include: Annual Meeting of the Association for Computational Linguistics (ACL), the flagship conference of the organization Empirical Methods in Natural Language Processing (EMNLP) International Joint Conference on Natural Language Processing (IJCNLP), held jointly one of the other conferences on a rotating basis Conference on Computational Natural Language Learning (CoNLL) Lexical and Computational Semantics and Semantic Evaluation (SemEval) Joint Conference on Lexical and Computational Semantics (SEM) Workshop on Statistical Machine Translation (WMT) Besides conferences, the ACL also sponsors the journals Computational Linguistics and Transactions of the Association for Computational Linguistics (TACL). Papers and other presentations at ACL and ACL-affiliated venues are archived online in the open-access ACL Anthology. == Special Interest Groups == ACL has a large number of Special Interest Groups (SIGs), focusing on specific areas of natural language processing. Some current SIGs within ACL are: == Presidents == Each year, the ACL elects a distinguished computational linguist who becomes vice-president of the organization in the next calendar year and president one year later. Recent ACL presidents are:
Bioz
Bioz is a search engine for life science experimentation. == History == Bioz was founded by Karin Lachmi and Daniel Levitt. Lachmi is a scientist who completed her postdoc in molecular and cellular biology at the Stanford University School of Medicine. During her lab work she found little available data regarding preferable lab tools, reagents and related products for experimentation. There are 50,000 vendors selling 300 million scientific products. She decided to start the company in order to provide researchers with adequate information for that purpose. Co-founder Daniel Levitt is an entrepreneur who sold his company WebAppoint to Microsoft in the year 2000. He also co-founded the company StemRad. At Bioz, Lachmi serves as the Chief Scientific Officer and Levitt serves as the chief executive officer. Bioz claims to have over a million researcher-users from 196 countries. Among the investors are Esther Dyson and the Stanford-StartX Fund. The company's advisory board includes Nobel Laureates in Chemistry Michael Levitt, Roger Kornberg, and Ada Yonath. == Technology == The company uses artificial intelligence, machine learning and natural language processing in order to extract experimentation data from scientific articles, such as the products that researchers used, the companies that supply the products, the protocol conditions that researchers selected, and the types of experiments and techniques. The algorithm ranks products based on how frequently they were used by researchers in their experiments, how recently a product was used, and the impact factor of the journal. The algorithm's output is a Bioz stars score for each product that was mentioned in an article. Bioz is a data-driven platform for product recommendations, which is contrary to platforms such as TripAdvisor and OpenTable that are based on user-generated reviews and ratings. The recommendations and scoring system that the company has developed are meant to assist researchers with the process of developing future medications and finding cures for diseases. They are guided towards products and techniques that were previously used by other researchers when planning and performing experiments. The company's revenue is based on selling SaaS subscriptions to researchers in biopharma companies. They also charge product suppliers for content syndication.
Training, validation, and test data sets
In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. These input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and testing sets. The model is initially fit on a training data set, which is a set of examples used to fit the parameters (e.g. weights of connections between neurons in artificial neural networks) of the model. The model (e.g. a naive Bayes classifier) is trained on the training data set using a supervised learning method, for example using optimization methods such as gradient descent or stochastic gradient descent. In practice, the training data set often consists of pairs of an input vector (or scalar) and the corresponding output vector (or scalar), where the answer key is commonly denoted as the target (or label). The current model is run with the training data set and produces a result, which is then compared with the target, for each input vector in the training data set. Based on the result of the comparison and the specific learning algorithm being used, the parameters of the model are adjusted. The model fitting can include both variable selection and parameter estimation. Successively, the fitted model is used to predict the responses for the observations in a second data set called the validation data set. The validation data set provides an unbiased evaluation of a model fit on the training data set while tuning the model's hyperparameters (e.g. the number of hidden units—layers and layer widths—in a neural network). Validation data sets can be used for regularization by early stopping (stopping training when the error on the validation data set increases, as this is a sign of over-fitting to the training data set). This simple procedure is complicated in practice by the fact that the validation data set's error may fluctuate during training, producing multiple local minima. This complication has led to the creation of many ad-hoc rules for deciding when over-fitting has truly begun. Finally, the test data set is a data set used to provide an unbiased evaluation of a model fit on the training data set. When the data in the test data set has never been used (for example in cross-validation), the test data set is called a holdout data set. The term "validation set" is sometimes used instead of "test set" in some literature (e.g., if the original data set was partitioned into only two subsets, the test set might be referred to as the validation set). Deciding the sizes and strategies for data set division in training, test and validation sets is very dependent on the problem and data available. == Training data set == A training data set is a data set of examples used during the learning process and is used to fit the parameters (e.g., weights) of, for example, a classifier. For classification tasks, a supervised learning algorithm looks at the training data set to determine, or learn, the optimal combinations of variables that will generate a good predictive model. The goal is to produce a trained (fitted) model that generalizes well to new, unknown data. The fitted model is evaluated using “new” examples from the held-out data sets (validation and test data sets) to estimate the model’s accuracy in classifying new data. To reduce the risk of issues such as over-fitting, the examples in the validation and test data sets should not be used to train the model. Most approaches that search through training data for empirical relationships tend to overfit the data, meaning that they can identify and exploit apparent relationships in the training data that do not hold in general. When a training set is continuously expanded with new data, then this is incremental learning. == Validation data set == A validation data set is a data set of examples used to tune the hyperparameters (i.e. the architecture) of a model. It is sometimes also called the development set or the "dev set". An example of a hyperparameter for artificial neural networks includes the number of hidden units in each layer. It, as well as the testing set (as mentioned below), should follow the same probability distribution as the training data set. In order to avoid overfitting, when any classification parameter needs to be adjusted, it is necessary to have a validation data set in addition to the training and test data sets. For example, if the most suitable classifier for the problem is sought, the training data set is used to train the different candidate classifiers, the validation data set is used to compare their performances and decide which one to take and, finally, the test data set is used to obtain the performance characteristics such as accuracy, sensitivity, specificity, F-measure, and so on. The validation data set functions as a hybrid: it is training data used for testing, but neither as part of the low-level training nor as part of the final testing. The basic process of using a validation data set for model selection (as part of training data set, validation data set, and test data set) is: Since our goal is to find the network having the best performance on new data, the simplest approach to the comparison of different networks is to evaluate the error function using data which is independent of that used for training. Various networks are trained by minimization of an appropriate error function defined with respect to a training data set. The performance of the networks is then compared by evaluating the error function using an independent validation set, and the network having the smallest error with respect to the validation set is selected. This approach is called the hold out method. Since this procedure can itself lead to some overfitting to the validation set, the performance of the selected network should be confirmed by measuring its performance on a third independent set of data called a test set. An application of this process is in early stopping, where the candidate models are successive iterations of the same network, and training stops when the error on the validation set grows, choosing the previous model (the one with minimum error). == Test data set == A test data set is a data set that is independent of the training data set, but that follows the same probability distribution as the training data set. A test set is therefore a set of examples used only to assess the performance (i.e. generalization) of a specified classifier on unseen data. To do this, the model is used to predict classifications of examples in the test set. Those predictions are compared to the examples' true classifications to assess the model's accuracy. If a model fit to the training and validation data set also fits the test data set well, minimal overfitting has taken place (see figure below). A better fitting of the training or validation data sets as opposed to the test data set usually points to overfitting. In the scenario where a data set has a low number of samples, it is usually partitioned into a training set and a validation data set, where the model is trained on the training set and refined using the validation set to improve accuracy, but this approach will lead to overfitting. The holdout method can also be employed, where the test set is used at the end, after training on the training set. Other techniques, such as cross-validation and bootstrapping, are used on small data sets. The bootstrap method generates numerous simulated data sets of the same size by randomly sampling with replacement from the original data, allowing the random data points to serve as test sets for evaluating model performance. Cross-validation splits the data set into multiple folds, with a single sub-fold used as test data; the model is trained on the remaining folds, and all folds are cross-validated (with results averaged and models consolidated) to estimate final model performance. Note that some sources advise against using a single split, as it can lead to overfitting as well as biased model performance estimates. For this reason, data sets are split into three partitions: training, validation and test data sets. The standard machine learning practice is to train on the training set and tune hyperparameters using the validation set, where the validation process selects the model with the lowest validation loss, which is then tested on the test data set (normally held out) to assess the final model. The holdout method for the test set reduces computation by avoiding using the test set after each epoch. The test data set should never be used for validating the training model or fine-tuning hyperparameters, as it provides an accurate and honest evaluation of the model's final performance on unseen dat
Dimensionality reduction
Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension. Working in high-dimensional spaces can be undesirable for many reasons; raw data are often sparse as a consequence of the curse of dimensionality, and analyzing the data is usually computationally intractable. Dimensionality reduction is common in fields that deal with large numbers of observations and/or large numbers of variables, such as signal processing, speech recognition, neuroinformatics, and bioinformatics. Methods are commonly divided into linear and nonlinear approaches. Linear approaches can be further divided into feature selection and feature extraction. Dimensionality reduction can be used for noise reduction, data visualization, cluster analysis, or as an intermediate step to facilitate other analyses. == Feature selection == The process of feature selection aims to find a suitable subset of the input variables (features, or attributes) for the task at hand. The three strategies are: the filter strategy (e.g., information gain), the wrapper strategy (e.g., accuracy-guided search), and the embedded strategy (features are added or removed while building the model based on prediction errors). Data analysis such as regression or classification can be done in the reduced space more accurately than in the original space. == Feature projection == Feature projection (also called feature extraction) transforms the data from the high-dimensional space to a space of fewer dimensions. The data transformation may be linear, as in principal component analysis (PCA), but many nonlinear dimensionality reduction techniques also exist. For multidimensional data, tensor representation can be used in dimensionality reduction through multilinear subspace learning. === Principal component analysis (PCA) === The main linear technique for dimensionality reduction, principal component analysis, performs a linear mapping of the data to a lower-dimensional space in such a way that the variance of the data in the low-dimensional representation is maximized. In practice, the covariance (and sometimes the correlation) matrix of the data is constructed and the eigenvectors on this matrix are computed. The eigenvectors that correspond to the largest eigenvalues (the principal components) can now be used to reconstruct a large fraction of the variance of the original data. Moreover, the first few eigenvectors can often be interpreted in terms of the large-scale physical behavior of the system, because they often contribute the vast majority of the system's energy, especially in low-dimensional systems. Still, this must be proved on a case-by-case basis as not all systems exhibit this behavior. The original space (with dimension of the number of points) has been reduced (with data loss, but hopefully retaining the most important variance) to the space spanned by a few eigenvectors. === Non-negative matrix factorization (NMF) === NMF decomposes a non-negative matrix to the product of two non-negative ones, which has been a promising tool in fields where only non-negative signals exist, such as astronomy. NMF is well known since the multiplicative update rule by Lee & Seung, which has been continuously developed: the inclusion of uncertainties, the consideration of missing data and parallel computation, sequential construction which leads to the stability and linearity of NMF, as well as other updates including handling missing data in digital image processing. With a stable component basis during construction, and a linear modeling process, sequential NMF is able to preserve the flux in direct imaging of circumstellar structures in astronomy, as one of the methods of detecting exoplanets, especially for the direct imaging of circumstellar discs. In comparison with PCA, NMF does not remove the mean of the matrices, which leads to physical non-negative fluxes; therefore NMF is able to preserve more information than PCA as demonstrated by Ren et al. === Kernel PCA === Principal component analysis can be employed in a nonlinear way by means of the kernel trick. The resulting technique is capable of constructing nonlinear mappings that maximize the variance in the data. The resulting technique is called kernel PCA. === Graph-based kernel PCA === Other prominent nonlinear techniques include manifold learning techniques such as Isomap, locally linear embedding (LLE), Hessian LLE, Laplacian eigenmaps, and methods based on tangent space analysis. These techniques assume that the high-dimensional input data lies near a low-dimensional manifold embedded in the ambient space, and construct a low-dimensional representation using a cost function that retains local properties of the data; they can be viewed as defining a graph-based kernel for Kernel PCA. More recently, techniques have been proposed that, instead of defining a fixed kernel, try to learn the kernel using semidefinite programming. The most prominent example of such a technique is maximum variance unfolding (MVU). The central idea of MVU is to exactly preserve all pairwise distances between nearest neighbors (in the inner product space) while maximizing the distances between points that are not nearest neighbors. An alternative approach to neighborhood preservation is through the minimization of a cost function that measures differences between distances in the input and output spaces. Important examples of such techniques include: classical multidimensional scaling, which is identical to PCA; Isomap, which uses geodesic distances in the data space; diffusion maps, which use diffusion distances in the data space; t-distributed stochastic neighbor embedding (t-SNE), which minimizes the divergence between distributions over pairs of points; and curvilinear component analysis. A different approach to nonlinear dimensionality reduction is through the use of autoencoders, a special kind of feedforward neural networks with a bottleneck hidden layer. The training of deep encoders is typically performed using a greedy layer-wise pre-training (e.g., using a stack of restricted Boltzmann machines) that is followed by a finetuning stage based on backpropagation. === Linear discriminant analysis (LDA) === Linear discriminant analysis (LDA) is a generalization of Fisher's linear discriminant, a method used in statistics, pattern recognition, and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events. === Generalized discriminant analysis (GDA) === GDA deals with nonlinear discriminant analysis using kernel function operator. The underlying theory is close to the support-vector machines (SVM) insofar as the GDA method provides a mapping of the input vectors into high-dimensional feature space. Similar to LDA, the objective of GDA is to find a projection for the features into a lower dimensional space by maximizing the ratio of between-class scatter to within-class scatter. === Autoencoder === Autoencoders can be used to learn nonlinear dimension reduction functions and codings together with an inverse function from the coding to the original representation. === t-SNE === T-distributed Stochastic Neighbor Embedding (t-SNE) is a nonlinear dimensionality reduction technique useful for the visualization of high-dimensional datasets. It is not recommended for use in analysis such as clustering or outlier detection since it does not necessarily preserve densities or distances well. === UMAP === Uniform manifold approximation and projection (UMAP) is a nonlinear dimensionality reduction technique. Visually, it is similar to t-SNE, but it assumes that the data is uniformly distributed on a locally connected Riemannian manifold and that the Riemannian metric is locally constant or approximately locally constant. == Dimension reduction == For high-dimensional datasets, dimension reduction is usually performed prior to applying a k-nearest neighbors (k-NN) algorithm in order to mitigate the curse of dimensionality. Feature extraction and dimension reduction can be combined in one step, using principal component analysis (PCA), linear discriminant analysis (LDA), canonical correlation analysis (CCA), or non-negative matrix factorization (NMF) techniques to pre-process the data, followed by clustering via k-NN on feature vectors in a reduced-dimension space. In machine learning, this process is also called low-dimensional embedding. For high-dimensional datasets (e.g., when performing similarity search on live video streams, DNA data, or high-dimensional time series), running a fast approximate k-NN search using locality-sensitive hashing, random projection, "sketches", or other high-dimensional similarity search techniques from the VLDB conference toolbox may be the only fe
Necrobotics
Necrobotics is the practice of using biotic materials (or dead organisms) as robotic components. Necrobotics can serve as an alternative to mechanical components that are difficult to manufacture by using biological components designed by natural selection in order to exploit the highly developed selective design implemented in biological lifeforms via the process of evolution. In July 2022, researchers in the Preston Innovation Lab at Rice University in Houston, Texas published a paper in Advanced Science introducing the concept and demonstrating its capability by repurposing dead spiders as robotic grippers and applying pressurized air to activate their gripping arms. In April 2025 researchers at Shinshu University created a “bio-hybrid drone” using silk-worm moth antennae to detect the source of a smell. In November 2025 researchers at McGill University demonstrated the use of a mosquito proboscis as a fine nozzle in experimental 3D printing. Necrobotics utilizes the spider's organic hydraulic system and their compact legs to create an efficient and simple gripper system. The necrobotic spider gripper is capable of lifting small and light objects, thereby serving as an alternative to complex and costly small mechanical grippers. == Background == The main appeal of the spider's body in necrobotics is its compact leg mechanism and use of hydraulic pressure. The spider's anatomy utilizes a simple hydraulic (fluid) pressure system. Spider legs have flexor muscles that naturally constrict their legs when relaxed. A force is required to straighten and extend their legs, which spiders accomplish by pumping hemolymph fluid (blood) through their joints as a means of hydraulic pressure. It takes no external power to curl their legs due to their flexor muscles' natural curled state. In July 2022, researchers in the Preston Innovation Lab at Rice University published a paper detailing their experiments with the gripper. Although dead spiders no longer produce hemolymph, Te Faye Yap (lead author and mechanical engineering graduate) found that pumping air through a needle into the spider's cephalothorax (main body) accomplishes the same results as hemolymph. The original hydraulic (fluid) system is essentially converted into a pneumatic (air) system. == Fabrication == Obtain a spider Euthanize the spider using a cold temperature of around -4°C for 5-7 days Insert a 25 gauge hypodermic needle into the spider's cephalothorax (main body) Apply glue around the needle to form a seal and allow it to dry Connect a syringe or pump to the needle Extend the spider's legs by pumping air in == Testing and Data == === Internal Force Versus Gripping Force === The typical pressure in a resting spider's legs ranges from 4 kPa to 6.1 kPa. Researchers extended the legs by increasing the spider's internal pressure to 5.5 kPa. Pumping air into the body increases the internal pressure, causing the legs to expand. Pumping air out of the body decreases internal pressure, causing the legs to contract due to their flexor leg muscles. When the internal pressure decreases to 0 kPa, the gripper would be fully closed, allowing for the gripper to grasp objects. This action demonstrates that as internal pressure decreases, the gripping force increases. Inversely, when internal pressure increases, the gripping force decreases. By gripping individual weighted acetate beads, it is found that the necrobotic gripper achieves a maximum gripping force of 0.35 milinewtons. === Spider Weight Versus Gripping Force === To estimate the gripping forces of smaller and larger spiders, researchers created a plot to predict the gripping force relative to the size of the spider. The wolf spider's body weight is relatively equal to the gripping force of its legs. The mass of the gripper is 33.5 mg and can lift 1.3 times its body weight (43.6 mg or 0.35 mN). However, with larger spiders, the gripping force relative to body weight decreases. For example, a 200-gram goliath birdeater is predicted to lift 10% of its weight (20 grams or 196 mN). Though there is an inverse relationship between spider mass and gripping force, larger spiders exert greater gripping forces than smaller spiders. === Gripper Lifespan === The necrobotic gripper's functionality is entirely reliant on the structural integrity of the spider. If the spider were to break down easily and frequently, the gripper would not be practical. Using cyclic testing, a series of repeated actions, it is found that the necrobotic gripper can actuate 700 to 1000 times. After 1000 cycles, cracks begin forming on the membrane of the leg joints due to dehydration. Weakened and decomposing joints lead to frequent breakage and replacement, thereby serving as an obstacle in applying necrobotics to real-world scenarios. One theorized fix to this issue is applying beeswax or a lubricant to the joints. Researchers found that over 10 days, the mass of an uncoated spider decreased 17 times more than the mass of a spider coated with beeswax. Lubricating joints combats dehydration and slows the loss of organic material. == Constraints == With the usage of organic material, there is a higher chance of the component decomposing and breaking down as opposed to traditional mechanical systems. There may be additional work and management required to replace these grippers if they fail. Additionally, organic inconsistencies with the spiders will yield inaccurate results. Not all wolf spiders develop the same, so gripping force and leg contraction can vary between grippers. There are moral implications behind euthanizing spiders for robotics. The ethical boundaries that necrobotics push in the pursuit of biohybrid systems raise concerns, as opponents say it may lead to the hybridization of mammals and is intrusive to nature. Proponents respond that repurposing dead animals has been human practice for millennia and that necrobotics should be pursued to advance science.
Yooreeka
Yooreeka is a library for data mining, machine learning, soft computing, and mathematical analysis. The project started with the code of the book "Algorithms of the Intelligent Web". Although the term "Web" prevailed in the title, in essence, the algorithms are valuable in any software application. It covers all major algorithms and provides many examples. Yooreeka 2.x is licensed under the Apache License rather than the somewhat more restrictive LGPL (which was the license of v1.x). The library is written 100% in the Java language. == Algorithms == The following algorithms are covered: Clustering Hierarchical—Agglomerative (e.g. MST single link; ROCK) and Divisive Partitional (e.g. k-means) Classification Bayesian Decision trees Neural Networks Rule based (via Drools) Recommendations Collaborative filtering Content based Search PageRank DocRank Personalization