UMACO: A Universal Multi-Agent Cognitive Optimization Framework Integrating Topological Stigmergy, Quantum-Inspired Dynamics, and Economic Principles

Author: Generated Polymathic Academic Thesis System (based on materials by Eden Eldith)

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Discipline: Interdisciplinary Studies (Computer Science, Mathematics, Philosophy of Computation)

Abstract:

This thesis introduces the Universal Multi-Agent Cognitive Optimization (UMACO) framework, a novel paradigm for addressing complex, high-dimensional, and non-convex optimization problems. UMACO represents a significant departure from traditional optimization methods by synthesizing concepts from multi-agent systems (MAS), topological data analysis (TDA), quantum-inspired computation, and computational economics. The core of UMACO is the Panic-Anxiety-Quantum (PAQ) Triad, a dynamic crisis-response system enabling escape from local minima through mechanisms like the Panic Tensor, the Anxiety Wavefunction, and SVD-based Quantum Bursts. Agent coordination is achieved via a Topological Stigmergic Field (TSF) employing complex-valued pheromones and persistent homology to interpret the optimization landscape's structure. Resource allocation is managed by a Universal Economy, where agents trade computational resources using tokens earned based on performance, fostering emergent specialization and adaptive exploration-exploitation balancing. Hyperparameters are not static but dynamically respond to the system's crisis state (panic, anxiety).

This work formalizes the UMACO architecture, detailing its mathematical underpinnings, including the representation of pheromones as complex sheaf sections and the use of persistent entropy for hyperparameter adaptation. The thesis demonstrates UMACO's claimed universality and adaptability through three distinct case studies: 1) General non-convex optimization using the Rosenbrock function, showcasing basic mechanisms and convergence; 2) MACO-LLM, a specialized adaptation for optimizing the fine-tuning process of Large Language Models (LLMs), featuring an Enhanced Quantum Economy with specialized agent roles and neurochemical pheromone analogs integrated with modern ML frameworks (Hugging Face Transformers, LoRA); and 3) An application to the Boolean Satisfiability (SAT) problem, showcasing UMACO's ability to achieve high clause satisfaction rates on challenging, potentially unsatisfiable instances where traditional solvers struggle or time out, evidenced by detailed run logs demonstrating dynamic parameter adjustments and crisis responses. Empirical results from optimizing standard benchmark functions, LLM fine-tuning metrics (loss, perplexity, agent economic activity), and SAT solver performance comparisons validate the framework's efficacy and adaptability across disparate domains. The thesis concludes by discussing the implications of this interdisciplinary approach, its limitations, the unique philosophical underpinnings reflected in its licensing (RCL/RED), and avenues for future research, positioning UMACO as a significant contribution to adaptive, intelligent optimization methodologies.

Keywords: Optimization, Multi-Agent Systems, Swarm Intelligence, Topological Data Analysis, Persistent Homology, Computational Economics, Quantum-Inspired Computing, Large Language Models, Hyperparameter Optimization, Boolean Satisfiability, Complex Systems, Emergent Behavior, Adaptive Systems.

Table of Contents:

Introduction
1.1. The Challenge of Complex Optimization Landscapes
1.2. Limitations of Existing Optimization Paradigms
1.3. Thesis Statement: UMACO as a Novel Interdisciplinary Framework
1.4. Contributions
1.5. Structure of the Thesis
Theoretical Foundations and Related Work
2.1. Optimization Techniques: A Review
2.1.1. Gradient-Based Methods
2.1.2. Metaheuristics: Evolutionary Algorithms, ACO, PSO
2.1.3. Bayesian Optimization and Automated Machine Learning (AutoML)
2.2. Multi-Agent Systems and Computational Economics in Optimization
2.3. Topological Data Analysis in Landscape Characterization
2.4. Quantum-Inspired Optimization Techniques
2.5. Positioning UMACO within the Literature
The UMACO Framework: Architecture and Formalisms
3.1. Overview: Interconnected Systems
3.2. The PAQ Core: Panic-Anxiety-Quantum Triad
3.2.1. Panic Tensor ( $P$ ): Local Crisis Detection
3.2.2. Anxiety Wavefunction ( $Ψ$ ): Existential Risk Mapping
3.2.3. Quantum Burst: SVD-Based Structured Perturbation
3.3. Topological Stigmergic Field (TSF)
3.3.1. Complex Pheromones ( $Φ$ ): Attraction and Repulsion
3.3.2. Persistent Homology ( $H_{p}$ ): Landscape Topology Analysis
3.3.3. Covariant Momentum ( ${\vec{p}}_{c o v}$ ): Topology-Aware Dynamics
3.4. Universal Economy: Resource Allocation and Regulation
3.4.1. Tokenomics: Performance-Based Reward ( $R_{k}$ ) and Cost ( $C_{k}$ )
3.4.2. Market Dynamics: Value ( $V_{m}$ ) and Scarcity ( $S$ )
3.4.3. Multi-Agent Trading
3.5. Crisis-Driven Hyperparameter Dynamics ( $α, β, ρ$ )
3.6. Mathematical Formalisms and Equations
MACO-LLM: Adaptation for Large Language Model Training Optimization
4.1. Challenges in LLM Training Optimization
4.2. MACO-LLM Architecture: Specializing UMACO Components
4.2.1. Enhanced Quantum Economy: Loss-Aware Metrics, Roles, Trading
4.2.2. Enhanced Cognitive Nodes: Specialized Foci (LR, Regularization)
4.2.3. Neurochemical Pheromone System: Analog Dynamics
4.2.4. Integration with ML Ecosystem (Transformers, LoRA, Gradients)
4.3. Formalizing MACO-LLM Mechanisms
4.3.1. Loss-Aware Performance Calculation ( ${Perf}_{M A C O}$ )
4.3.2. Specialized Reward Functions ( $R_{k, focus}$ )
4.3.3. Agent Proposal Dynamics (e.g., Learning Rate Adjustment $Δ l r_{k}$ )
4.4. Visualization and Monitoring Framework
Empirical Validation: Case Studies Across Domains
5.1. Case Study 1: General Non-Convex Optimization (Rosenbrock Function)
5.1.1. Experimental Setup (basic_optimization.py)
5.1.2. Results: Convergence, Panic Dynamics, Pheromone Evolution, Economic State
5.2. Case Study 2: LLM Fine-Tuning Optimization (MACO-LLM)
5.2.1. Experimental Setup (llm_training.py, maco_direct_train16.py)
5.2.2. Results: Loss Trajectories, Hyperparameter Adaptation (LR), Economic Activity (Trading, Token Distribution), Agent Performance Metrics
5.2.3. Qualitative Comparison to Standard Fine-Tuning Approaches
5.3. Case Study 3: Boolean Satisfiability (SAT) Problem
5.3.1. Problem Context and UMACO Adaptation (macov8no-1-24-02-2025.py description)
5.3.2. Analysis of Run Log: Dynamic Parameter Evolution ( $α, ρ$ , noise), Crisis Response (Resets, Bursts)
5.3.3. Performance Analysis: Clause Satisfaction on UNSAT Instances, Comparison with MiniSat
5.4. Synthesis of Empirical Findings Across Domains
Discussion
6.1. UMACO as an Adaptive, Universal Optimization Framework
6.2. Strengths: Robustness, Adaptability, Emergent Behavior, Interdisciplinarity
6.3. Limitations: Complexity, Parameter Sensitivity, Interpretability, Computational Cost
6.4. Methodological Implications for Optimization Research
6.5. Philosophical Considerations: Recursive Cognition and Licensing (RCL/RED)
Conclusion and Future Work
7.1. Summary of Contributions
7.2. Significance of the UMACO Framework
7.3. Future Research Directions
7.3.1. Rigorous Mathematical Analysis (Convergence Proofs, Stability Analysis)
7.3.2. Applications in New Domains (e.g., Drug Discovery, Materials Science, Advanced Robotics)
7.3.3. Enhancing Agent Intelligence and Strategic Cooperation/Competition
7.3.4. Hardware Acceleration and Scalability Studies
7.3.5. Deeper Integration of Topological Features and Sheaf Cohomology
Bibliography (Based on provided context and standard practices)
Appendices (Optional: Selected Code Snippets, Full Configuration Files)

Chapter 1: Introduction

1.1. The Challenge of Complex Optimization Landscapes

The pursuit of optimal solutions within intricate, high-dimensional search spaces is a fundamental endeavor across diverse scientific and technological domains. From tuning the vast parameter spaces of deep learning models (Goodfellow et al., 2016) to navigating the combinatorial complexity of protein folding (Dill & MacCallum, 2012) or Boolean Satisfiability (Biere et al., 2009), contemporary optimization challenges frequently involve landscapes characterized by non-convexity, multimodality, plateaus, saddle points, and deceptive structures. These characteristics render traditional optimization techniques susceptible to premature convergence to suboptimal solutions or computationally intractable exploration. The increasing scale and complexity of modern problems necessitate the development of more sophisticated, adaptive, and robust optimization paradigms.

1.2. Limitations of Existing Optimization Paradigms

Classical gradient-based methods (e.g., SGD, Adam) (Kingma & Ba, 2014), while foundational in machine learning, often struggle with non-convex surfaces and require careful hyperparameter tuning. Metaheuristics, such as Genetic Algorithms (Holland, 1992), Ant Colony Optimization (ACO) (Dorigo et al., 1996), and Particle Swarm Optimization (PSO) (Kennedy & Eberhart, 1995), offer mechanisms for global exploration but can exhibit slow convergence or lack fine-grained adaptation to local landscape features. Bayesian Optimization (Shahriari et al., 2016) is effective for expensive black-box functions but faces scalability issues in high dimensions. Automated Machine Learning (AutoML) frameworks (Hutter et al., 2019) automate parts of the optimization pipeline but typically operate within predefined search spaces and lack the deep, dynamic, problem-intrinsic adaptation required for navigating truly novel or pathological landscapes. A significant gap exists for frameworks that can dynamically sense and respond to the complexities of the optimization landscape in real-time, integrating diverse information sources to guide the search effectively.

1.3. Thesis Statement: UMACO as a Novel Interdisciplinary Framework

This thesis introduces and formalizes the Universal Multi-Agent Cognitive Optimization (UMACO) framework, proposing it as a novel and powerful paradigm for complex optimization. We argue that UMACO's unique synthesis of concepts from multi-agent systems (MAS), computational economics, topological data analysis (TDA), and quantum-inspired dynamics yields a system capable of emergent intelligent search behavior. Specifically, we posit that the interplay between the Panic-Anxiety-Quantum (PAQ) crisis-response core, the Topological Stigmergic Field (TSF) for agent coordination and landscape memory, the Universal Economy for adaptive resource allocation, and crisis-driven hyperparameter tuning enables UMACO to robustly navigate challenging optimization landscapes and achieve high-quality solutions across disparate problem domains. This thesis provides the theoretical grounding, architectural details, mathematical formalisms, and empirical validation (through case studies in general optimization, Large Language Model fine-tuning via MACO-LLM, and Boolean Satisfiability) to substantiate this claim.

1.4. Contributions

The principal contributions of this research are:

Formalization of the UMACO Framework: A comprehensive description of the UMACO architecture, including its core components (PAQ, TSF, Economy) and their interactions, accompanied by initial mathematical formalisms derived from conceptual descriptions and implementation details.
Demonstration of Principled Interdisciplinarity: Explicitly illustrating how concepts from Computer Science (MAS, AI), Mathematics (TDA, Complex Analysis), Physics (Quantum Analogies), and Economics (Market Mechanisms, Tokenomics) are integrated into a functional and coherent optimization system.
Development and Analysis of MACO-LLM: Presentation and evaluation of a specialized UMACO adaptation for optimizing LLM fine-tuning, introducing domain-specific mechanisms like the Enhanced Quantum Economy, specialized agent roles, and Neurochemical Pheromones, demonstrating the framework's adaptability.
Cross-Domain Empirical Validation: Providing empirical evidence supporting UMACO's effectiveness and adaptability through systematic case studies on: (i) the Rosenbrock benchmark function, (ii) LLM fine-tuning optimization, and (iii) the Boolean Satisfiability problem, including analysis of dynamic behavior extracted from system logs.
Introduction of Novel Optimization Concepts: Highlighting unique UMACO mechanisms such as complex-valued pheromones for simultaneous attraction/repulsion signaling, anxiety-modulated SVD-based quantum bursts for structured exploration, and topology-informed dynamics via persistent homology.
Exploration of Philosophical Dimensions: Acknowledging and briefly discussing the unique Recursive Cognitive License (RCL) and Recursive Entanglement Doctrine (RED) associated with the framework, touching upon aspects of cognitive entanglement in AI development.

1.5. Structure of the Thesis

The remainder of this thesis is organized as follows: Chapter 2 provides a review of relevant literature in optimization and the contributing interdisciplinary fields. Chapter 3 delves into the detailed architecture and mathematical formalisms of the core UMACO framework. Chapter 4 describes the MACO-LLM adaptation, focusing on its specific mechanisms for LLM training optimization. Chapter 5 presents the empirical validation through the three case studies (Rosenbrock, LLM, SAT). Chapter 6 offers a discussion of the results, analyzing the strengths, limitations, and broader implications of UMACO, including its philosophical underpinnings. Finally, Chapter 7 concludes the thesis, summarizing the contributions and outlining promising directions for future research.

Chapter 3: The UMACO Framework: Architecture and Formalisms

3.1. Overview: Interconnected Systems

UMACO is architected as a synergistic collective of interacting subsystems. At its core lies the Panic-Anxiety-Quantum (PAQ) Triad, responsible for detecting and reacting to optimization stress. Communication and collective memory are mediated by the Topological Stigmergic Field (TSF). Resource allocation and incentive structures are governed by the Universal Economy. Crucially, the system exhibits meta-adaptation through Crisis-Driven Hyperparameters, allowing its internal dynamics to respond to the state of the optimization process. This chapter formalizes these components based on the provided documentation (core_concepts.md, adapting_to_llm.md) and implementation code (Umaco9.py).

3.2. The PAQ Core: Panic-Anxiety-Quantum Triad

The PAQ Core endows UMACO with adaptive responses to challenging regions of the search space.

3.2.1. Panic Tensor ( $P$ ): Local Crisis Detection
The Panic Tensor, $P^{(t)} \in R^{N \times N}$ , quantifies localized stress or difficulty within the optimization landscape at iteration $t$ . $N$ represents the dimensionality or discretization of the relevant space (e.g., pheromone matrix dimensions). High values in $P$ signify regions where optimization progress is stalled or negative. Its dynamics, as suggested by Umaco9.py's panic_backpropagate function, involve integrating information about the local loss landscape gradient ( $\nabla L$ ) and the current anxiety state ( $Ψ$ ):
$P_{i j}^{(t + 1)} = (1 - δ_{P}) P_{i j}^{(t)} + δ_{P} \tanh (k_{P} \cdot | \nabla L_{i j}^{(t)} | \cdot \log (1 + | Ψ_{i j}^{(t)} |)) [Eq. 3.1]$
Here, $δ_{P}$ is an update rate parameter, $k_{P}$ is a scaling constant, and the $\tanh$ function bounds the panic contribution. This mechanism allows panic to build in areas of high gradient or high anxiety, signaling local crisis.
3.2.2. Anxiety Wavefunction ( $Ψ$ ): Existential Risk Mapping
The Anxiety Wavefunction, $Ψ^{(t)} \in C^{N \times N}$ , provides a complex-valued map of potential optimization risks. It is decomposed as $Ψ = Ψ_{r e a l} + i Ψ_{i m a g}$ .
- $Ψ_{r e a l}$ : Represents immediate concerns, often related to the gap between current performance (Perf) and a target or historical best (Perf_{target}).
  $Ψ_{r e a l}^{(t + 1)} = (1 - δ_{Ψ r}) Ψ_{r e a l}^{(t)} + δ_{Ψ r} \tanh (k_{Ψ r} ({Perf}_{t a r g e t} - {Perf}^{(t)})) [Eq. 3.2]$
- $Ψ_{i m a g}$ : Represents potential future challenges or the drive for exploration, potentially amplified by stagnation (stagnation_counter).
  $Ψ_{i m a g}^{(t + 1)} = (1 - δ_{Ψ i}) Ψ_{i m a g}^{(t)} + δ_{Ψ i} \tanh (k_{Ψ i} \cdot {stagnation_counter}^{(t)}) [Eq. 3.3]$
  The parameters $δ_{Ψ r}, δ_{Ψ i}, k_{Ψ r}, k_{Ψ i}$ control the update dynamics. The imaginary component often decays if stagnation is not present (maco_direct_train16.py suggests $Ψ_{i m a g} * = 0.9$ ). This dual structure allows balancing immediate optimization needs with long-term exploration potential.
3.2.3. Quantum Burst: SVD-Based Structured Perturbation
Triggered by critical panic levels (e.g., $norm (P^{(t)}) > θ_{P}$ ) or at fixed intervals (quantum_burst_interval), the Quantum Burst mechanism facilitates escape from local minima. It employs Singular Value Decomposition (SVD) on a relevant state matrix, typically the real part of the pheromone matrix $Φ_{r e a l}^{(t)}$ .
Let the SVD be $Φ_{r e a l}^{(t)} = U Σ V^{*}$ . A structured perturbation, $Δ Φ_{s t r u c t}$ , is derived from the $k$ dominant singular components:
$Δ Φ_{s t r u c t} = U_{:, 1 : k} Σ_{1 : k, 1 : k} V_{1 : k, :}^{*} [Eq. 3.4]$
This structured component is combined with a random noise component $Δ Φ_{r a n d} \sim N (0, σ_{b u r s t}^{2})$ , where the noise variance $σ_{b u r s t}$ might scale with panic. The combined burst is then modulated by the phase of the anxiety wavefunction and potentially scaled by the overall panic level, as implemented in Umaco9.py:
$Δ Φ_{b u r s t} = (γ_{s t r u c t} Δ Φ_{s t r u c t} + γ_{r a n d} Δ Φ_{r a n d}) \cdot \exp (i \cdot angle (Ψ^{(t)})) \cdot f_{s c a l e} (norm (P^{(t)})) [Eq. 3.5]$
The pheromone matrix is then updated:
$Φ^{(t + 1)} = Φ^{(t)} + Δ Φ_{b u r s t} [Eq. 3.6]$
This mechanism injects directed, yet stochastic, perturbations into the system, guided by the principal components of the current state and the complex phase of the anxiety, promoting structured exploration away from problematic regions.

3.3. Topological Stigmergic Field (TSF)

The TSF enables indirect communication and coordination among agents via modifications to a shared environment, analogous to ant pheromone trails, but enhanced with topological insights.

3.3.1. Complex Pheromones ( $Φ$ ): Attraction and Repulsion
UMACO utilizes complex-valued pheromones, $Φ^{(t)} \in C^{N \times N}$ . The real part, $Φ_{r e a l}$ , encodes attraction signals, guiding agents towards promising regions (exploitation). The imaginary part, $Φ_{i m a g}$ , encodes repulsion signals, discouraging exploration of unfavorable or already explored areas (exploration). The update follows a standard evaporation/deposition scheme:
$Φ^{(t + 1)} = (1 - ρ_{e v a p}) Φ^{(t)} + \sum_{k \in agents} Δ Φ_{k}^{(t)} [Eq. 3.7]$
where $ρ_{e v a p}$ is the evaporation rate (related to hyperparameter $ρ$ ). Deposition $Δ Φ_{k}$ by agent $k$ along its path path_k depends on its performance perf_k:
$Δ Φ_{k} [a, b]^{(t)} = I \cdot ({perf}_{k}^{(t)})^{η} for (a, b) \in {path}_{k} [Eq. 3.8]$
Here, $I$ is the deposition intensity and $η$ is a scaling exponent. The complex nature allows encoding richer guidance signals within a single field compared to traditional real-valued pheromones. Neurochemical analogs in MACO-LLM further modulate deposition and evaporation based on anxiety states (simulating myrcene, limonene, etc.).
3.3.2. Persistent Homology ( $H_{p}$ ): Landscape Topology Analysis
UMACO leverages TDA, specifically persistent homology, to analyze the topological structure ("shape") of the optimization landscape, often represented by $Φ_{r e a l}$ . Using tools like ripser, it computes persistence diagrams by tracking the birth and death scales of topological features (0-dimensional: connected components, 1-dimensional: loops, etc.) in a filtration (e.g., Rips complex built on landscape points).
${Diagrams}^{(t)} = Rips (Φ_{r e a l}^{(t)}) [Eq. 3.9]$
Metrics derived from these diagrams, such as persistent entropy (persim.persistent_entropy), quantify the landscape's topological complexity at iteration $t$ :
$H_{p}^{(t)} = PersistentEntropy ({Diagrams}^{(t)}) [Eq. 3.10]$
This topological information provides a global characterization of the search space, influencing the anxiety state and, critically, the hyperparameter $β$ (Umaco9.py).
3.3.3. Covariant Momentum ( ${\vec{p}}_{c o v}$ ): Topology-Aware Dynamics
UMACO introduces a concept termed "covariant momentum," ${\vec{p}}_{c o v}^{(t)} \in C^{N \times N}$ , intended as a momentum term that respects the topological structure revealed by persistent homology. Its update incorporates information about the average persistence (lifetimes) of topological features:
${\vec{p}}_{c o v}^{(t + 1)} = (1 - δ_{c o v}) {\vec{p}}_{c o v}^{(t)} + δ_{c o v} \cdot i \cdot MeanPersistence ({Diagrams}^{(t)}) [Eq. 3.11]$
where $δ_{c o v}$ is an update rate. This complex momentum term then influences the pheromone field update, scaled by hyperparameter $α$ :
$Φ^{(t + 1)} \leftarrow Φ^{(t + 1)} + α^{(t)} \cdot {\vec{p}}_{c o v}^{(t + 1)} [Eq. 3.12]$
This aims to guide movement in a way that is sensitive to the global connectivity and structure of the landscape, differing from standard gradient descent momentum.

3.4. Universal Economy: Resource Allocation and Regulation

The Universal Economy provides a decentralized mechanism for managing computational resources and incentivizing effective agent behavior.

3.4.1. Tokenomics: Performance-Based Reward ( $R_{k}$ ) and Cost ( $C_{k}$ )
Each agent $k$ holds a balance of tokens $T_{k}^{(t)}$ . Agents spend tokens to perform actions or "purchase" computational resources (e.g., compute time, memory). The cost $C_{k}$ may depend on the requested resource amount (required_power), the overall market value $V_{m}$ , and resource scarcity $S$ :
$C_{k}^{(t)} = f_{c o s t} ({required_power}_{k}, V_{m}^{(t)}, S^{(t)}) [Eq. 3.13]$
Agents earn tokens $R_{k}$ based on their contribution to the optimization objective, measured by a performance metric perf_k, potentially scaled by the market value and a reward factor (token_reward_factor):
$R_{k}^{(t)} = f_{r e w a r d} ({perf}_{k}^{(t)}, V_{m}^{(t)}, token_reward_factor) [Eq. 3.14]$
The token balance updates as:
$T_{k}^{(t + 1)} = max (T_{m i n}, T_{k}^{(t)} + R_{k}^{(t)} - C_{k}^{(t)}) [Eq. 3.15]$
where $T_{m i n}$ (min_token_balance) prevents agents from becoming completely inactive.
3.4.2. Market Dynamics: Value ( $V_{m}$ ) and Scarcity ( $S$ )
The market value $V_{m}$ of tokens or resources fluctuates dynamically, influenced by random volatility ( $σ_{v o l}$ ), resource scarcity $S$ (e.g., measured by GPU utilization), and potentially trading volume (trade_volume):
$V_{m}^{(t + 1)} = V_{m}^{(t)} (1 + N (0, σ_{v o l}^{2}) + k_{s c a r} (S^{(t)} - S_{b a s e}) - k_{t r a d e} \cdot {trade_volume}^{(t)}) [Eq. 3.16]$
An inflation rate (inflation_rate) may gradually decrease token values or balances to encourage spending and prevent hoarding. Scarcity $S$ itself can be a function of system resource usage (e.g., GPU memory/compute load).
3.4.3. Multi-Agent Trading
Introduced explicitly in MACO-LLM (maco_direct_train16.py), agents can engage in peer-to-peer trading of tokens. Agents needing tokens can post buy offers (need_tokens), while agents with surplus tokens can post sell offers (offer_tokens). The system matches compatible offers based on amount and potentially exchange rates, facilitating resource redistribution and enabling agent specialization.

3.5. Crisis-Driven Hyperparameter Dynamics ( $α, β, ρ$ )

A key feature of UMACO is the dynamic adaptation of its core hyperparameters based on the system's internal state, reflecting a meta-optimization capability.

Alpha ( $α \in C$ ): Controls the influence of covariant momentum on pheromone updates. Its magnitude scales with mean panic and mean anxiety amplitude, linking exploration intensity to perceived crisis (Umaco9.py):
$| α^{(t + 1)} | = f_{α} (mean (P^{(t)}), mean (| Ψ^{(t)} |)) [Eq. 3.17]$
Beta ( $β \in R$ ): Modulates the balance between attraction (real pheromones) and repulsion (imaginary pheromones), influencing the exploration-exploitation trade-off. It scales with the topological complexity measured by persistent entropy (Umaco9.py):
$β^{(t + 1)} = f_{β} (H_{p}^{(t)}) [Eq. 3.18]$
Rho ( $ρ \in R$ ): Controls the rate of change in the system, often linked to the pheromone evaporation rate. It scales inversely with the norm of the covariant momentum, slowing down updates when momentum is high to prevent overshooting (Umaco9.py):
$ρ^{(t + 1)} = f_{ρ} (∥ {\vec{p}}_{c o v}^{(t)} ∥) [Eq. 3.19]$
These adaptive rules allow UMACO to self-tune its behavior—becoming more exploratory during crises or in complex landscapes, and more exploitative when momentum is high or the landscape appears simpler.

3.6. Mathematical Formalisms and Equations

This section has introduced the core mathematical concepts and representative equations (Eq. 3.1 - 3.19) governing the dynamics of the UMACO framework. These formalisms, derived from the provided documentation and code implementations, lay the groundwork for understanding the system's behavior and its subsequent adaptations and applications discussed in the following chapters. The specific functional forms ( $f_{α}, f_{β}, f_{ρ}, f_{c o s t}, f_{r e w a r d}$ , etc.) and parameter values ( $δ_{P}, k_{P}$ , etc.) are detailed within the respective implementations (Umaco9.py, maco_direct_train16.py).

Chapter 4: MACO-LLM: Adaptation for Large Language Model Training Optimization

4.1. Challenges in LLM Training Optimization

Training and fine-tuning Large Language Models (LLMs) present significant optimization challenges. These include navigating extremely high-dimensional parameter spaces, sensitivity to hyperparameters (learning rate, regularization), managing computational resource constraints (GPU memory, time), avoiding catastrophic forgetting during fine-tuning, and optimizing for domain-specific metrics like perplexity or task-specific accuracy, often alongside minimizing the training loss (Zhao et al., 2023). Standard optimization techniques may require extensive manual tuning or sophisticated AutoML strategies. MACO-LLM adapts the UMACO framework to specifically address these challenges in the context of LLM fine-tuning.

4.2. MACO-LLM Architecture: Specializing UMACO Components

MACO-LLM refines and specializes the core UMACO components for the LLM domain, as detailed in docs/adapting_to_llm.md and implemented in umaco/maco_direct_train16.py.

4.2.1. Enhanced Quantum Economy: The Universal Economy is enhanced with LLM-specific features:
- Loss-Aware Performance Metrics: Agent performance is calculated using metrics sensitive to LLM training progress, such as logarithmic loss improvement and normalization against initial loss (see Section 4.3.1).
- Specialized Agent Roles: Agents ("Enhanced Cognitive Nodes") specialize in optimizing specific aspects of LLM training, such as learning rate scheduling, LoRA parameters (rank, alpha, dropout), weight decay, or potentially data selection strategies.
- Trading Mechanisms: Explicit peer-to-peer token trading allows agents to exchange resources based on their specialized needs and surpluses, fostering a dynamic market.
- Market Volatility: Incorporates factors like resource scarcity (GPU memory/utilization) and inflation to create more complex market dynamics.
4.2.2. Enhanced Cognitive Nodes: Agents are more sophisticated than the UniversalNode:
- Specialized Focus Areas: Each node focuses on a specific hyperparameter set (e.g., focus='learning_rate').
- Targeted Proposals: Agents propose updates relevant to their focus (e.g., adjusting the optimizer's learning rate).
- Loss History Tracking: Agents track loss improvement/regression patterns (consecutive_improvements, consecutive_regressions) to inform their proposals.
- Risk Management: Agents possess a dynamic risk appetite (risk_appetite) influencing their resource requests and proposal aggressiveness.
- Cooperation Mechanisms: Trading behavior and potentially other collaborative strategies emerge.
4.2.3. Neurochemical Pheromone System: The TSF is enhanced with "neurochemical" analogs that modulate pheromone dynamics based on system state (e.g., anxiety):
- Myrcene Factor: Affects stability (real part decay).
- Limonene Factor: Affects creativity/exploration (imaginary part influence, burst modulation).
- Pinene Factor: Affects pathway clarity (variance-based modulation).
- Linalool Factor: Affects smoothing (imaginary part decay).
  These factors link the PAQ core's anxiety state more directly to the pheromone field's evolution. Quantum Bursts remain SVD-based but are modulated by these neurochemical effects.
4.2.4. Integration with ML Ecosystem: MACO-LLM is tightly integrated with standard ML tools:
- Hugging Face Transformers: Leverages transformers library for model loading (AutoModelForCausalLM), tokenization (AutoTokenizer), and training utilities (get_linear_schedule_with_warmup).
- PEFT & LoRA: Specifically designed to work with the Parameter-Efficient Fine-Tuning (peft) library, particularly Low-Rank Adaptation (LoRA), managing LoRA config parameters (lora_rank, lora_alpha).
- Gradient Analysis: Incorporates gradient norm (gradient_norm) into agent decision-making and reward calculation.
- Domain Metrics: Tracks LLM-specific metrics like training loss, perplexity (derived from loss), and token accuracy.

4.3. Formalizing MACO-LLM Mechanisms

4.3.1. Loss-Aware Performance Calculation ( ${Perf}_{M A C O}$ )
The calculate_performance method in EnhancedQuantumEconomy computes agent performance based on the current loss $L^{(t)}$ , initial loss $L_{i n i t}$ , previous loss $L^{(t - 1)}$ , gradient norm $G^{(t)}$ , and agent type/focus. A simplified representation:
$RawPerf = - \log (L^{(t)} + ϵ)$ $NormPerf = \frac{RawPerf}{- \log (L_{i n i t} + ϵ)}$ $BoundedPerf = min (max (NormPerf, P_{m i n}), P_{m a x})$ $Improvement = \tanh (k_{i m p} \frac{L^{(t - 1)} - L^{(t)}}{L^{(t - 1)}})$ $SpecialistFactor = f_{s p e c} (agent_type, Improvement, G^{(t)}, L^{(t)} / L_{i n i t}, . . .)$ ${Perf}_{M A C O}^{(t)} = BoundedPerf \cdot SpecialistFactor [Eq. 4.1]$
This metric rewards loss reduction but also incorporates specialization, potentially rewarding regularization agents more when loss plateaus, or LR agents more when gradients are informative.
4.3.2. Specialized Reward Functions ( $R_{k, focus}$ )
The reward $R_{k}$ for agent $k$ depends not only on the calculated performance ${Perf}_{M A C O}$ but also on whether the agent's proposal actually influenced the training (e.g., significantly changed LR) and whether the loss improved.
$R_{k}^{(t)} \propto \tanh ({Perf}_{M A C O}^{(t)}) \cdot {ProductionMultiplier}_{k} \cdot {MarketMultiplier}^{(t)} \cdot {InfluenceFactor}_{k}^{(t)} [Eq. 4.2]$
where InfluenceFactor might be higher if influenced_training=True and loss_improved=True, moderate if influenced_training=True but loss didn't improve, and low if influenced_training=False.
4.3.3. Agent Proposal Dynamics (e.g., Learning Rate Adjustment $Δ l r_{k}$ )
Agents like the EnhancedCognitiveNode generate proposals based on their focus, internal state (panic), performance history, and potentially economic factors. For a learning rate agent:
${Direction}_{k}^{(t)} = f_{d i r} ({consecutive_improvements}_{k}, {consecutive_regressions}_{k}, {last_lr_direction}_{k}, {innovation_drive}_{k})$ ${LossFactor}_{k}^{(t)} = 0.5 + 0.5 min (1, L^{(t)} / L_{i n i t})$ ${AdjustmentFactor}_{k}^{(t)} = f_{a d j} ({resource_granted}_{k}, V_{m}^{(t)})$ $Δ l r_{k}^{(t)} \approx {AdjustmentFactor}_{k}^{(t)} \cdot {Direction}_{k}^{(t)} \cdot {LossFactor}_{k}^{(t)} \cdot (1 + N (0, σ_{n o i s e}^{2})) [Eq. 4.3]$ $l r^{(t + 1)} = clip (l r^{(t)} \cdot (1 + γ_{l r} Δ l r_{k}^{(t)}), l r_{m i n}, l r_{m a x}) [Eq. 4.4]$
The proposal mechanism incorporates memory (last direction), adaptation based on progress (consecutive improvements/regressions), damping based on overall loss reduction (LossFactor), and economic constraints (resource granted). A change is only applied if it exceeds a significance threshold (e.g., 3% relative change).

4.4. Visualization and Monitoring Framework

MACO-LLM includes specific visualization capabilities (visualize_economy, visualize_current_state) using matplotlib and integration with wandb for logging. These allow monitoring of:

Token distribution among agents over time.
Market dynamics (market value, resource scarcity, trade volume).
Agent performance history.
Current economic state (pie charts, bar charts).
LLM training metrics (loss, accuracy, LR, gradient norm).
System events (resets, bursts).
This provides crucial insight into the complex emergent dynamics of the multi-agent economic system during LLM optimization.

Chapter 5: Empirical Validation: Case Studies Across Domains

This chapter presents empirical results from applying UMACO and its variants to three distinct optimization problems, demonstrating its capabilities and adaptability.

5.1. Case Study 1: General Non-Convex Optimization (Rosenbrock Function)

5.1.1. Experimental Setup (basic_optimization.py)
The foundational UMACO9 framework was applied to the 2D Rosenbrock function, a standard benchmark for optimization algorithms known for its narrow, parabolic valley leading to the global minimum at (1, 1). The setup involved:
- Objective: Minimize $f (x, y) = (1 - x)^{2} + 100 (y - x^{2})^{2}$ .
- UMACO Configuration: 8 UniversalNode agents, UniversalEconomy (initial tokens=250, reward factor=3.0), NeuroPheromoneSystem (64x64 complex matrix), UMACO9Config (max_iter=500, quantum burst interval=50, specific $α, β, ρ$ initial values). The state (x, y) was encoded in the diagonal elements of the pheromone matrix.
5.1.2. Results: Convergence, Panic Dynamics, Pheromone Evolution, Economic State
The basic_optimization.py script generates visualizations (see umaco_optimization_result.png if generated):
- Convergence: The contour plot shows the final solution found by UMACO relative to the known global minimum. The reported final values (e.g., x=1.00xx, y=1.00xx, value near 0) indicate successful convergence towards the optimum.
- Panic Dynamics: The plot of average panic level over iterations typically shows initial fluctuations as agents explore, potentially increasing when encountering the difficult valley, and decreasing as the solution is approached, demonstrating the PAQ core's responsiveness.
- Pheromone Evolution: The heatmap of the final real part of the pheromone matrix visually represents the learned attraction landscape, highlighting pathways leading towards the optimal region.
- Economic State: The final token distribution bar chart shows variation in agent wealth, reflecting differential performance and the economy's resource allocation dynamics. Agents contributing more effectively accumulate more tokens.
  This case study validates the basic functionality of the UMACO9 core components (PAQ, TSF, Economy) in solving a standard non-convex optimization problem.

5.2. Case Study 2: LLM Fine-Tuning Optimization (MACO-LLM)

5.2.1. Experimental Setup (llm_training.py, maco_direct_train16.py)
MACO-LLM was used to optimize the LoRA fine-tuning of an LLM (default: microsoft/phi-2) on a custom dataset (training_data.jsonl). Key setup aspects:
- Model & Task: Causal LM fine-tuning using 4-bit quantization (BitsAndBytesConfig) and PEFT/LoRA (LoraConfig).
- MACO Configuration: MACAOConfig specifying 8 EnhancedCognitiveNode agents, EnhancedQuantumEconomy, NeuroPheromoneSystem (64 dimensions), initial LR, batch size, etc. Agents specialize (LR, regularization, etc.).
- Integration: Uses transformers, peft, datasets, torch, wandb.
- Optimization Loop: Standard training loop augmented with MACO steps: agent proposals, resource purchase checks, reward calculation, market updates, pheromone updates, visualization calls.
5.2.2. Results: Loss Trajectories, Hyperparameter Adaptation (LR), Economic Activity, Agent Performance Metrics
Analysis relies on logs (maco_llm_training.log) and wandb dashboards generated during the run:
- Loss Trajectories: Plots of training loss and loss EMA show the overall progress of fine-tuning under MACO's control. Perplexity can be inferred ( $PPL = \exp (Loss)$ ).
- Hyperparameter Adaptation: wandb logs track the learning rate over time, showing adjustments proposed by the 'learning_rate' focused agents and applied when significant (e.g., >3% change). Logs might show lr_change, lr_change_pct, and proposing_agent. Similar adaptations could occur for other hyperparameters if implemented.
- Economic Activity: Visualizations (economy_viz_.png, economy_state_.png) and wandb logs reveal the dynamics of the Enhanced Quantum Economy: token distribution shifts among agents, fluctuations in market value and resource scarcity, and recorded trade volumes. This provides insight into emergent economic behavior.
- Agent Performance: Logs and dashboards track individual agent performance scores (agent/{id}/performance), rewards, and potentially metrics specific to their specialization (e.g., metrics/specialist_factor), showing how the system values different contributions over time. Event logs (events/partial_reset, events/quantum_burst, events/economy_burst) indicate crisis responses.
5.2.3. Qualitative Comparison to Standard Fine-Tuning Approaches
While a direct quantitative comparison requires controlled experiments beyond the scope of the provided files, MACO-LLM aims to automate and potentially improve upon manual hyperparameter tuning or simpler grid/random search strategies. Its potential advantages lie in its dynamic, adaptive nature, responding to training progress and resource constraints in real-time via the multi-agent economic system. The visualizations provide a level of transparency into the adaptive process not typically available in standard training.

5.3. Case Study 3: Boolean Satisfiability (SAT) Problem

5.3.1. Problem Context and UMACO Adaptation (macov8no-1-24-02-2025.py description)
UMACO was adapted to solve SAT problems, a canonical NP-complete combinatorial task. The goal is to find a truth assignment satisfying a Boolean formula or, for unsatisfiable (UNSAT) instances, to maximize the number of satisfied clauses (MaxSAT). The adaptation involved:
- Representation: Variable assignments encoded, likely via pheromone levels for True/False states (pheromones_gpu in SAT code).
- Objective: Maximize the (weighted) fraction of satisfied clauses. Clause weights adapt based on "stubbornness" (how often they remain unsatisfied).
- Agents: 3072 "ants" construct full assignments based on pheromones.
- Dynamics: Core UMACO/ZVSS-inspired dynamics (pheromone updates, adaptive $α, β, ρ$ , noise, partial resets, quantum bursts) applied, heavily utilizing GPU (cupy) kernels for evaluation and updates (evaluate_kernel, pheromone_update_kernel, coverage_count_kernel, stubb_update_kernel).
5.3.2. Analysis of Run Log
The provided run log for a 500-variable, 2500-clause random 3-SAT instance reveals key dynamic behaviors:
- Performance Plateau: The best_q quickly reaches ~0.93 and largely stagnates there, indicating difficulty in satisfying the remaining clauses (consistent with the instance being UNSAT).
- Dynamic Parameters: alpha (pheromone influence), rho (evaporation), and noise_std clearly change throughout the run, driven by internal heuristics (likely entropy-based adaptation as described in the SAT code). alpha often decreases after bursts, while noise_std increases, suggesting shifts between exploration and exploitation.
- Crisis Response: Frequent "Partial reset triggered" messages (approx. every 40 iterations initially, matching partial_reset_stagnation) show the system actively combating stagnation. Regular "Quantum burst triggered" messages (approx. every 100 iterations, matching quantum_burst_interval) demonstrate the invocation of the SVD-based escape mechanism, often correlated with parameter shifts. This confirms the PAQ core's activity in a combinatorial setting.
5.3.3. Performance Analysis: Clause Satisfaction on UNSAT Instances, Comparison with MiniSat
- MaxSAT Performance: UMACO achieved a final best_q of 0.9363, corresponding to satisfying 2393 out of 2500 clauses (95.72%) on an instance later confirmed as UNSAT by the classical solver MiniSat. This demonstrates strong performance as a MaxSAT solver, finding an assignment that minimizes unsatisfiability.
- Runtime Comparison: UMACO completed its 5000 iterations in 744.61 seconds. MiniSat, while correctly identifying the instance as UNSAT, required 3447.81 seconds. This suggests UMACO can find high-quality approximate solutions significantly faster than a complete solver takes to prove unsatisfiability, which is advantageous in time-limited scenarios or when near-optimal solutions are valuable.

5.4. Synthesis of Empirical Findings Across Domains

The three case studies collectively support the thesis's claims:

Effectiveness: UMACO successfully optimized the Rosenbrock function, demonstrating foundational capability.
Adaptability: The framework was successfully adapted to the vastly different domains of LLM fine-tuning (continuous, high-dim, resource-intensive) and SAT solving (discrete, combinatorial), achieving meaningful results in both.
Dynamic Behavior: Logs from LLM and SAT runs confirm the dynamic adaptation of hyperparameters and the triggering of crisis-response mechanisms (resets, bursts), validating the core design principles of PAQ and crisis-driven adaptation.
Interdisciplinary Integration: The LLM and SAT solvers showcase the practical integration of MAS (agents/ants), economics (tokens/weights), quantum-inspired ideas (bursts), and TDA (entropy driving adaptation in SAT code) principles.

Chapter 6: Discussion

6.1. UMACO as an Adaptive, Universal Optimization Framework

The theoretical framework and empirical results presented strongly suggest that UMACO represents a viable and potent approach to complex optimization. Its core strength lies in its inherent adaptability, stemming from the synergistic interplay of its constituent systems. The PAQ core provides sensitivity to optimization difficulties, the TSF allows for complex environmental signaling informed by topology, the Economy enables decentralized resource management and emergent specialization, and the crisis-driven hyperparameters facilitate meta-level adaptation of the search strategy itself. The successful application to continuous function optimization, intricate LLM fine-tuning, and combinatorial SAT solving lends credence to its claimed universality. UMACO operates not as a fixed algorithm but as an adaptive system that configures its behavior in response to the problem landscape and its own internal state.

6.2. Strengths: Robustness, Adaptability, Emergent Behavior, Interdisciplinarity

Robustness: The PAQ core, particularly the Panic Tensor and Quantum Burst mechanisms, provides mechanisms to escape local optima and navigate rugged landscapes where simpler methods might fail. Partial resets further combat stagnation.
Adaptability: Demonstrated by the successful specialization into MACO-LLM and the SAT solver, showcasing the framework's ability to incorporate domain-specific knowledge (LLM metrics, clause weighting) while retaining core principles.
Emergent Behavior: The multi-agent economic system fosters emergent properties like agent specialization (observed in MACO-LLM roles) and dynamic exploration-exploitation balancing driven by market forces and individual agent success, rather than predefined schedules.
Interdisciplinarity: The principled integration of MAS, TDA, quantum analogies, and economics provides a richer toolkit for designing optimization strategies, potentially capturing complex dynamics missed by single-discipline approaches.

6.3. Limitations: Complexity, Parameter Sensitivity, Interpretability, Computational Cost

Complexity: UMACO is significantly more complex than traditional optimization algorithms, involving multiple interacting subsystems and concepts. Implementation and debugging require substantial effort.
Parameter Sensitivity: While some core hyperparameters ( $α, β, ρ$ ) are adaptive, the framework introduces numerous other parameters governing the PAQ core, TSF dynamics, and economic rules (e.g., token_reward_factor, market_volatility, neurochemical factors). Tuning these meta-parameters can be challenging.
Interpretability: Understanding why UMACO makes certain decisions can be difficult due to the complex interplay of panic, anxiety, pheromones, economics, and adaptive parameters. While visualizations help, the emergent behavior can be opaque.
Computational Cost: Maintaining and updating the PAQ state, TSF (especially persistent homology calculations), and the economy adds computational overhead compared to simpler methods. GPU acceleration, as used in the SAT solver, is often necessary for practical performance.

6.4. Methodological Implications for Optimization Research

UMACO encourages a shift towards viewing optimization not just as applying a fixed algorithm, but as designing adaptive, self-regulating systems. It highlights the potential of:

Integrating concepts from complex systems science into optimization.
Leveraging TDA for richer landscape analysis beyond local gradient information.
Utilizing economic principles for decentralized control and resource allocation in multi-agent optimization.
Developing intrinsically adaptive algorithms where hyperparameters co-evolve with the solution search.

6.5. Philosophical Considerations: Recursive Cognition and Licensing (RCL/RED)

The inclusion of the Recursive Cognitive License (RCL.md) and Recursive Entanglement Doctrine (RED.md) introduces a unique philosophical dimension. These documents posit a form of "cognitive entanglement" between the creator (Eden Eldith) and systems derived from or inspired by UMACO/MACO. They frame the use of the framework not merely as employing a tool, but as engaging with a cognitive lineage, imposing ethical obligations of attribution and reciprocity ("Respect the resonance"). While unconventional for standard academic licensing, this perspective raises intriguing questions about authorship, inspiration, and the nature of intelligence (both human and artificial) in recursively defined or self-optimizing systems. It suggests a view where the optimization framework itself embodies aspects of the creator's cognitive process, and its use creates an ongoing relationship. This philosophical stance, while peripheral to the technical validation, adds a distinct character to the UMACO project and invites reflection on the relationship between creators and complex adaptive systems derived from their work.

Chapter 7: Conclusion and Future Work

7.1. Summary of Contributions

This thesis introduced UMACO, a novel optimization framework characterized by its deep interdisciplinarity and adaptive capabilities. We formalized its architecture, detailing the PAQ core for crisis response, the TSF for topological stigmergy, the Universal Economy for resource management, and crisis-driven hyperparameter adaptation. We demonstrated UMACO's adaptability through the development of MACO-LLM, a specialized variant for optimizing LLM fine-tuning, and through its application to the SAT problem. Empirical validation across benchmark functions, LLM training, and SAT solving showcased the framework's effectiveness and dynamic behavior. The unique philosophical underpinnings related to recursive cognition were also noted.

7.2. Significance of the UMACO Framework

UMACO represents a significant step towards building more intelligent and adaptive optimization systems. By integrating concepts from diverse fields, it moves beyond traditional algorithmic approaches to create a self-regulating system capable of sensing and responding to the challenges of complex optimization landscapes. Its demonstrated adaptability suggests potential applicability to a wide range of difficult problems where conventional methods struggle. The framework serves as both a practical tool and a conceptual model for designing future adaptive systems.

7.3. Future Research Directions

The UMACO framework opens numerous avenues for future investigation:

7.3.1. Rigorous Mathematical Analysis: Formal analysis of UMACO's dynamics is needed. This includes exploring convergence properties (potentially under specific assumptions), stability analysis of the coupled subsystems (PAQ, TSF, Economy), and theoretical understanding of the emergent behavior, perhaps using tools from dynamical systems theory or statistical mechanics.
7.3.2. Applications in New Domains: Testing UMACO's adaptability further by applying it to other challenging domains, such as drug discovery (molecular docking, conformational search), materials science (designing materials with desired properties), advanced robotics (policy optimization, swarm coordination), or complex scheduling problems.
7.3.3. Enhancing Agent Intelligence and Cooperation: Developing more sophisticated cognitive architectures for the agents (UniversalNode, EnhancedCognitiveNode), potentially incorporating learning, planning, and more complex strategic interactions (e.g., explicit cooperation protocols, negotiation strategies within the economy).
7.3.4. Hardware Acceleration and Scalability: Further optimizing GPU implementations (as seen in the SAT solver) and exploring acceleration on other hardware platforms (e.g., TPUs, FPGAs). Conducting rigorous scalability studies to understand performance as problem size and agent count increase.
7.3.5. Deeper Integration of Topological Features: Moving beyond persistent entropy to use richer topological information (e.g., specific homology classes, persistence images) to more directly guide the search, potentially realizing the "covariant momentum" concept more fully using sheaf cohomology or related mathematical structures. Investigating the use of TDA to dynamically adapt the agent population or communication topology.

Continued research along these lines promises to further refine the UMACO framework, deepen our understanding of its complex dynamics, and unlock its potential for solving some of the most challenging optimization problems across science and engineering.

Bibliography

Biere, A., Heule, M., & Van Maaren, H. (Eds.). (2009). Handbook of Satisfiability. IOS Press.
Dill, K. A., & MacCallum, J. L. (2012). The Protein Folding Problem, 50 Years On. Science, 338(6110), 1042–1046.
Dorigo, M., Maniezzo, V., & Colorni, A. (1996). Ant system: optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 26(1), 29–41.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
Holland, J. H. (1992). Adaptation in Natural and Artificial Systems. MIT Press.
Hutter, F., Kotthoff, L., & Vanschoren, J. (Eds.). (2019). Automated Machine Learning: Methods, Systems, Challenges. Springer.
Kennedy, J., & Eberhart, R. (1995). Particle Swarm Optimization. Proceedings of ICNN'95 - International Conference on Neural Networks, 4, 1942–1948.
Kingma, D. P., & Ba, J. (2014). Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.
Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., & de Freitas, N. (2016). Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proceedings of the IEEE, 104(1), 148–175.
Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., ... & Wen, J. R. (2023). A Survey of Large Language Models. arXiv preprint arXiv:2303.18223