Software Module Clustering

Problem Description

The Software Module Clustering Problem (SMCP) involves organizing the code of software systems into different folders or packages. The objective is to optimize the modularity of the code, maximizing cohesion and minimizing coupling. This, in turn, enhances maintainability, scalability, and understandability of software systems. The SMCP is central to software engineering, particularly in the context of software maintenance, comprehension, and reengineering.
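
As a minimal sketch of what maximizing cohesion and minimizing coupling can mean in practice, the example below counts dependencies that stay inside a package (a simple proxy for cohesion) and dependencies that cross package boundaries (a simple proxy for coupling). The function name, the module names, and the two counting rules are illustrative assumptions, not the metrics used in the referenced works.

```python
from typing import Dict, List, Tuple


def cohesion_and_coupling(
    dependencies: List[Tuple[str, str]],
    assignment: Dict[str, str],
) -> Tuple[int, int]:
    """Count intra-package and inter-package dependencies.

    Hypothetical proxies: 'cohesion' is the number of dependencies whose two
    endpoints share a package, 'coupling' is the number that cross packages.
    """
    intra = 0
    inter = 0
    for source, target in dependencies:
        if assignment[source] == assignment[target]:
            intra += 1
        else:
            inter += 1
    return intra, inter


# Hypothetical dependency list: (module, module it depends on).
deps = [("parser", "lexer"), ("parser", "ast"), ("ui", "widgets"), ("ui", "parser")]

# Two candidate organizations of the same five modules.
grouped = {"parser": "frontend", "lexer": "frontend", "ast": "frontend",
           "ui": "gui", "widgets": "gui"}
scattered = {"parser": "a", "lexer": "b", "ast": "a", "ui": "b", "widgets": "a"}

print(cohesion_and_coupling(deps, grouped))    # (3, 1)
print(cohesion_and_coupling(deps, scattered))  # (1, 3)
```

Under these assumed proxies, the first organization keeps three of the four dependencies inside a package, while the second leaves three of them crossing package boundaries.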


Industrial Context

In industry, large-scale software systems often evolve over time, leading to architectural erosion and tangled dependencies. SMCP provides a systematic approach to restructure such systems by identifying logical groupings of modules. This is especially valuable in legacy systems, microservices migration, and continuous integration environments. Companies aiming to refactor monolithic applications or improve software quality metrics benefit significantly from effective module clustering.


Common Challenges

There are three main challenges that arise in the SMCP:

  • Scalability: the SMCP is an NP-hard problem. The number of possible solutions grows exponentially with the number of modules, so finding the best solution in an acceptable time is challenging.
  • Modelling: there are different ways of modelling a software system to find a good organization. Modelling entities and dependencies within the code is not trivial.
  • Evaluation: there are different metrics to evaluate the quality of solutions. Moreover, some of those metrics are in conflict (improving one often means worsening another), so a trade-off must be found. The desirable trade-off is subjective and depends on the decision maker.
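
To give a feel for the scalability challenge, the sketch below counts candidate solutions under the assumption that a solution is any partition of the modules into non-empty groups. Under that assumption, the number of solutions for n modules is the Bell number B(n). The function name is hypothetical, and the text above does not commit to this exact solution space.

```python
from typing import List


def bell_numbers(limit: int) -> List[int]:
    """Return [B(0), ..., B(limit)] computed with the Bell triangle."""
    row = [1]          # current row of the Bell triangle
    result = [1]       # B(0) = 1
    for _ in range(limit):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry to its left and the one above-left.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        result.append(row[0])
    return result


print(bell_numbers(10))
# [1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975]
```

Ten modules already admit 115,975 partitions, which illustrates why exhaustive enumeration stops being practical for systems of realistic size.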

Solution Approaches

To address the evaluation challenge, the SMCP can be tackled as a multi-objective optimization problem, in which several objectives are considered and the preferences of the decision maker are taken into account. To address the scalability challenge, approximate methods are usually chosen, since they are able to find high-quality solutions in short computing times. In particular, most recent proposals are based on the Variable Neighborhood Search framework.
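
The following is a minimal, single-objective sketch of a basic Variable Neighborhood Search loop (shake, local search, neighborhood change) applied to a toy clustering instance. It is not the algorithm from the referenced papers. The objective (intra-cluster minus inter-cluster dependencies), the fixed number of clusters, and all function and parameter names are assumptions made for illustration. This toy objective is trivially maximized by placing every module in one cluster, which practical quality metrics are designed to avoid.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def score(edges: List[Edge], assignment: Dict[str, int]) -> int:
    # Toy objective (an assumption): +1 per intra-cluster edge, -1 per inter-cluster edge.
    return sum(1 if assignment[a] == assignment[b] else -1 for a, b in edges)


def local_search(edges: List[Edge], assignment: Dict[str, int], n_clusters: int) -> Dict[str, int]:
    # Move one module at a time to another cluster while the score strictly improves.
    current = dict(assignment)
    improved = True
    while improved:
        improved = False
        for module in sorted(current):
            for cluster in range(n_clusters):
                candidate = dict(current)
                candidate[module] = cluster
                if score(edges, candidate) > score(edges, current):
                    current = candidate
                    improved = True
    return current


def shake(assignment: Dict[str, int], k: int, n_clusters: int, rng: random.Random) -> Dict[str, int]:
    # Perturb the solution by reassigning k randomly chosen modules.
    shaken = dict(assignment)
    for module in rng.sample(sorted(shaken), k):
        shaken[module] = rng.randrange(n_clusters)
    return shaken


def basic_vns(edges: List[Edge], assignment: Dict[str, int], n_clusters: int,
              k_max: int = 3, iterations: int = 50, seed: int = 0) -> Dict[str, int]:
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best = local_search(edges, assignment, n_clusters)
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            candidate = local_search(edges, shake(best, k, n_clusters, rng), n_clusters)
            if score(edges, candidate) > score(edges, best):
                best, k = candidate, 1   # improvement: return to the smallest neighborhood
            else:
                k += 1                   # no improvement: try a larger perturbation
    return best


# Hypothetical instance: two triangles of modules joined by a single dependency.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
start = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}
result = basic_vns(edges, start, n_clusters=2)
print(score(edges, start), score(edges, result))
```

A multi-objective variant would replace the single `score` comparison with a dominance check over several objectives and keep a set of non-dominated solutions rather than one `best`.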


References

  • Yuste, J., Pardo, E. G., Duarte, A., & Hao, J. K. (2024). Multi-objective general variable neighborhood search for software maintainability optimization. Engineering Applications of Artificial Intelligence, 133, 108593.
  • Yuste, J., Pardo, E. G., & Duarte, A. (2024). General Variable Neighborhood Search for the optimization of software quality. Computers & Operations Research, 165, 106584.
  • Yuste, J., Pardo, E. G., & Duarte, A. (2022, October). Multi-objective variable neighborhood search for improving software modularity. In International Conference on Variable Neighborhood Search (pp. 58-68). Cham: Springer Nature Switzerland.
  • Yuste, J., Duarte, A., & Pardo, E. G. (2022). An efficient heuristic algorithm for software module clustering optimization. Journal of Systems and Software, 190, 111349.
  • Yuste, J., Pardo, E. G., & Duarte, A. (2022, July). Variable neighborhood descent for software quality optimization. In Metaheuristics International Conference (pp. 531-536). Cham: Springer International Publishing.

Acknowledgments

This research has been partially supported by: grants PGC2018-095322-B-C22, PID2021-125709OA-C22, PID2021-126605NB-I00, funded by MCIN/AEI/10.13039/501100011033, Spain and by “ERDF A way of making Europe”; grant P2018/TCS-4566, funded by the Comunidad de Madrid, Spain and cofinanced by the European Structural Funds ESF and FEDER, Spain; grant CIAICO/2021/224 funded by Generalitat Valenciana, Spain; grant M2988 funded by “Proyectos Impulso de la Universidad Rey Juan Carlos 2022, Spain”; “Cátedra de Innovación y Digitalización Empresarial entre Universidad Rey Juan Carlos y Second Episode, Spain” (Ref. ID MCA06); and “Red Española de optimización heurística 4.0 digitalización, Spain” (Ref. RED2022-134480-T). The opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily reflect those of any of the funders.