Release 2.0.4

This release introduces new packaging: RPM packages for the dependencies and system files, and a Python wheel for the Python code. It also adds new Prometheus metrics and fixes a few bugs, including incorrectly configured source logging.

Issues fixed in this release

Bugs fixed:

  • DE 670: changes to fix source logging (@goodenou)

  • DE 674: Upgrading isort version to fix pre-config install error w/ poetry (@mambelli)

  • DE 683: Fix unit tests (@vitodb)

  • DE 696: Re-enable flake8 linter (@vitodb)

  • DE 701: Fixed bugs in metrics with labels (@IlyaBaburashvili)

  • DE 700: For flake8 skip build folder (@vitodb)

  • DE 713: Set /metrics Content-type header to text/plain (@shreyb)

  • DE 710: add invocation in the child processes for sources/channels logging (@namrathaurs)
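
To illustrate the DE 713 fix, here is a minimal, dependency-free sketch of a /metrics endpoint that answers with a text/plain Content-Type, which is what Prometheus scrapers expect for the text exposition format. It is not the Decision Engine implementation: the handler class, the helper function, and the de_example_up sample metric are all hypothetical names chosen for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical payload in the Prometheus text exposition format.
SAMPLE_METRICS = (
    "# HELP de_example_up Illustrative gauge, not a real Decision Engine metric.\n"
    "# TYPE de_example_up gauge\n"
    "de_example_up 1\n"
)


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve SAMPLE_METRICS on /metrics as plain text (illustrative only)."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = SAMPLE_METRICS.encode("utf-8")
        self.send_response(200)
        # The point of the example: declare the payload as text/plain.
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_metrics_content_type():
    """Start a throwaway server, scrape it once, return (content type, body)."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/metrics")
        response = conn.getresponse()
        content_type = response.getheader("Content-Type")
        body = response.read().decode("utf-8")
        conn.close()
        return content_type, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_metrics_content_type() returns the header value together with the scraped text, which makes it easy to check the header in a unit test.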

Enhancements:

  • DE 668: Introduce publisher status data product (@knoepfel)

  • DE 671: explicitly select a container with all our required python versions (@jcpunk)

  • DE 676: Added EL9 instructions (@mambelli)

  • DE 678: Updated and tested instructions, PIP installation working (@mambelli)

  • DE 679: Disabling unit tests for python 3.6 (@vitodb)

  • DE 680: Make default database init use stronger data protections (@jcpunk)

  • DE 681: Update to pytest 7 with pytest-postgresql 5 (@jcpunk)

  • DE 682: Updated EL9 instruction: PIP installation and GWMS config tested (@mambelli)

  • DE 685: Enable some tests only on DE 1.7 branch (@vitodb)

  • DE 687: Update GH actions (@vitodb)

  • DE 688: Added two metrics on the Source - de_source_status and de_source_acquire_seconds (@skylerfoster67)

  • DE 690: Adding DE EL9 containers based on AlmaLinux9 (@vitodb)

  • DE 691: Adding Jenkinsfile for EL9 (@vitodb)

  • DE 689: Added new de-client metrics for duration (@IlyaBaburashvili)

  • DE 694: Decision Engine Components Data (@skylerfoster67)

  • DE 697: Redis Exporter Data (@skylerfoster67)

  • DE 703: Create Redis mock for unit tests (@shreyb)

  • DE 706: In Jenkins pipeline config use podman instead of docker (@vitodb)

  • DE 712: Rpm pip packaging with uv and pyproject.toml (@mambelli)

  • DE 714: Fixed spec file and added release script (@mambelli)

  • DE 715: Added wrapper to run the decisionengine commands also as root. Fixed installation glitches. (@mambelli)

  • DE 716: Added a check for the Python code being installed and improved Python code install (@mambelli)

  • DE 717: Added codespell in pre-commit and fixed files to compliance (@mambelli)
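
As a rough illustration of what the two Source metrics from DE 688 might capture, the sketch below records a per-source status and the time spent in an acquire call, then renders both in the Prometheus text exposition format. Only the metric names de_source_status and de_source_acquire_seconds come from these notes. The metric types (gauge and summary), the source_name label, the 1/0 status encoding, and the SourceMetrics class are assumptions, and the toy registry stands in for prometheus-client.

```python
import time
from contextlib import contextmanager


class SourceMetrics:
    """Toy in-memory stand-in for a metrics registry (not prometheus-client)."""

    def __init__(self):
        self.status = {}         # source name -> 1 (last acquire ok) or 0 (failed)
        self.acquire_sum = {}    # source name -> total seconds spent acquiring
        self.acquire_count = {}  # source name -> number of acquire calls

    @contextmanager
    def time_acquire(self, source_name):
        """Time the wrapped block and record whether it raised."""
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.status[source_name] = 0
            raise
        else:
            self.status[source_name] = 1
        finally:
            elapsed = time.perf_counter() - start
            self.acquire_sum[source_name] = self.acquire_sum.get(source_name, 0.0) + elapsed
            self.acquire_count[source_name] = self.acquire_count.get(source_name, 0) + 1

    def render(self):
        """Return the recorded values in the text exposition format."""
        lines = ["# TYPE de_source_status gauge"]
        for name in sorted(self.status):
            lines.append(f'de_source_status{{source_name="{name}"}} {self.status[name]}')
        lines.append("# TYPE de_source_acquire_seconds summary")
        for name in sorted(self.acquire_count):
            lines.append(
                f'de_source_acquire_seconds_sum{{source_name="{name}"}} {self.acquire_sum[name]:.6f}'
            )
            lines.append(
                f'de_source_acquire_seconds_count{{source_name="{name}"}} {self.acquire_count[name]}'
            )
        return "\n".join(lines) + "\n"
```

A caller would wrap each acquire in `with metrics.time_acquire("my_source"):` and expose `metrics.render()` on the metrics endpoint.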

Full list of commits since version 2.0.2

34ae84f15: Added ability to build Python (wheel and sdist) packages to make-release.sh

8192524ba: Added codespell in pre-commit and fixed files to compliance

e045cb960: Added a check for the Python code being installed. Improved the Python install script, adding the ability to clone the repo

7093d5b12: Added wrapper to run the decisionengine commands also as root. Fixed installation glitches.

59cd65825: Fixed spec file and added release script

8174106dd: Adding dependencies RPM package and packaging with uv and pyproject.toml

22951dc58: Set /metrics Content-type header to text/plain

aea4fd7ad: logging fix for sources and channels

30fb83dc5: In Jenkins pipeline config use podman instead of docker

0aca4d926: Added .vscode/ to .gitignore

489d45abf: Dynamically check for redis server availability when a test is marked as redis

ff674229e: Remove unit test marker

6b1799f1b: Check if redis server is running. If not, skip integration test. Also marked applicable unit tests in this file

6879f72c5: Added unit marker for pytest

22488d720: For flake8 skip build folder

19a163a14: Fixed bugs in metrics with labels (#701)

0a38b170b: Update test to be compliant with flake8 linter

22162226c: Add licence info to flake8 config file

1da6bd8df: Re-enable flak8 linter

65d5ccd9a: Dashboard on Redis Exporter information

daaac113c: Added json files containing the dashboards for Source and Channel Data in decisionengine/dashboards

d21728923: Added license compliance for json files

ba29da2af: [pre-commit.ci] auto fixes from pre-commit.com hooks

d79908778: Configure buckets for de-client metrics

14826e839: [pre-commit.ci] auto fixes from pre-commit.com hooks

87457a1e0: Fix de-client --metrics description

2cc376a60: Added decorators for all rpc_ methods

7fe99a642: Added directory for Grafana dashboards

a9d1590c3: Adding Jenkinsfile for EL9

018c8ab47: Updated documentation for 1.7.5 release

ebfd6d4a0: Numpy security update requires >= 1.22 (not available for Python 3.6)

aa474a9d8: Fixed misspelling

9e9088418: Adding DE EL9 containers based on AlmaLinux9

438621a0a: Added two metrics on the Source - de_source_status and de_source_acquire_seconds

f54b92a59: Update GH actions

f842996df: enable rpmbuild_el7, run_flake8 and pytest_el7 tests only on DE branch 1.7

97915e0be: Note of possible future change

1c23be195: update regexp in _expected_circularity

aa12938f9: Updated instruction after testing PIP installation and pressure-based condifuraition on EL9

d302e74bf: Update to pytest 7 with pytest-postgresql 5

e9372585e: Make default database init use stronger data protections

aed2d7666: Disabling unit tests for python 3.6

87b3f6e92: Updated and tested instructions, PIP installation working

84ebaa528: Fixed setup for EL9

0053e5963: Added EL9 instructions

d23db48f2: Upgrading isort version to fix install error w/ poetry

fd3c5bda7: explicitly select a container with all our required python versions

85617b14d: fix source logging by defining logger in Sources correctly

669555cc7: Add docstrings.

1681f211d: Add timing information for when publisher was disabled.

33e4f699e: Test publisher status w/in publisher.

4d3d9100b: Some renaming.
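
Commits 489d45abf and 6b1799f1b above describe checking whether a Redis server is available before running tests that need it. One simple way to make such a check, sketched here with the standard library only, is to try opening a TCP connection to the server port. The function name and its defaults (localhost, port 6379) are assumptions for illustration; the check confirms only that something is listening, not that it speaks the Redis protocol.

```python
import socket


def redis_is_reachable(host="127.0.0.1", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened.

    This is a liveness probe for the port only; it does not send any
    Redis command, so another service on the same port would also pass.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

In a pytest suite, the result could feed a `pytest.mark.skipif(not redis_is_reachable(), reason="Redis is not running")` decorator so that Redis-dependent tests are skipped rather than failed.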

Release 2.0.3

This tag was skipped.

Release 2.0.2

This is mainly a bug-fix and documentation release. Instructions for running on EL8 have been added, and an UP/DOWN status metric is now exposed via Prometheus.

Issues fixed in this release

  • 428 : Decision engine 1.7.3 bug too many open file descriptors in glide_frontend_element.py

  • 427 : Set CONTINUE_IF_NO_PROXY to False to allow hybrid configuration

Full list of commits since version 2.0.1

7ec132e9: [pre-commit.ci] auto fixes from pre-commit.com hooks

b942241a: Add installation instructions for CentOS 8

4f6fc134: [pre-commit.ci] auto fixes from pre-commit.com hooks

e8d1922e: Fix docstrings errors and warnings

fc6aefd5: Docker container and test setup for EL8

51d5293f: [pre-commit.ci] auto fixes from pre-commit.com hooks

0c15d3bd: Added UP/DOWN status metric of the decision engine

fc76a1f0: Fixup coverage for new version

04b18750: Set upper limit version for flake8. This is needed to have pytest-flake8 and flake8 versions working together.

98797411: Add ‘Setup pressure-based pilot submission’ section to install document

0165183c: make RPM requires more flexible

28e2a0d4: Updated release notes for 2.0.1 and porting of 1.7.3

Release 2.0.1

Patch level (bug fix) release.

Issues fixed in this release

Bugs fixed:

  • DE 639: de-client --status stalls whenever channels are not yet in STEADY state

  • DE 638: Sources should go offline if the client channel is offline

  • DE 634: de-client --stop-channel / --start-channel doesn’t work in 2.0rc2

  • DE 626: New DE 2.0rc2 regularly takes 2-3 minutes to shut down

  • DE 599: Clarify timeout variable in block_while()

  • DE 522: Decision engine log files get split between several different processes with several different versions open

  • DE 236: New race condition in de-client

Enhancements:

  • DE 650: Added separate log files for Sources

Full list of commits since version 2.0.0

b5e56ab8: Remove signal handler.

0fb6814b: Prevent blocking (if possible) during service actions.

bb68fc31: Add logging handler to client-message receiver.

53fefbc5: Update kombu version.

009cdd95: Use kombu queues for server/client communication.

29a1ee25: add distinct logging for sources

e44e9210: Update GitHub actions; pylint workaround.

d192f8fb: Lock typing_extensions for Python 3.6 compat

2b946043: Fix pre-commit node version to 17.9.0, the last to support SL7.

76f3ddfb: lock pyupgrade to python3.6 support

c9c7cb3e: Include psutil as part of runtime requirements.

df8a3941: Make sure to kill worker process.

69924d0c: Do not block de-client calls during startup.

ddb18d7c: Minor cleanups.

f4dc7da7: Do not take source offline more than once during detach.

cbffa992: Update Docker entrypoint script for DE 2.0 branch

e10fe5af: Fixed cross-package link in the documentation

9da1eac8: Added cross-package link in the documentation

d278726b: Updated 2.0 release notes and indexes, ready for 2.0.0

Release 2.0.0

This release series follows 1.7. Many changes began in 1.7.0 and have continued since, so we felt a new major version number was warranted. We are proud to introduce Decision Engine 2.0.0 to outside users: it provides a friendlier installation procedure and configuration samples for testing it on all resources supported by the GlideinWMS Factory, such as OSG, some HPC resources, and commercial cloud providers.

  • New architecture with redesigned source system using Kombu message passing with a Redis backend.

  • Token support via DE modules: support for SciToken, WlcgToken (for CE authentication), and HTCondor IDTOKENS (for Glideins and Factory communication).

  • Separation from the GlideinWMS Frontend. Decision Engine still shares some libraries with GlideinWMS, but you no longer need to install and configure the Frontend.

  • Structured logging. Improved Python logging and adoption of a structured log format that increases the semantic content of the messages and eases the export of information to dashboards and Elasticsearch.

  • Monitoring via Prometheus.

  • SQLAlchemy object-relational mapper to increase the testability of DB interactions and to allow different database backends.

  • Packaging via setuptools for both decisionengine and decisionengine_modules: Dependencies are not yet fully listed in the RPMs.

  • Added support for CentOS8 (RHEL7 is still our main platform)

  • Configuration example using HTC resources via GlideinWMS Factory

  • Decision Engine is distributed under the Apache 2.0 license

  • We expanded our CI tests to include code auto-formatting and license compliance checks. We introduced integration tests, and we are proud of our unit test coverage of over 95%.

Note

SQLAlchemy is required and is now the only supported datasource backend. Upgrading from a different datasource backend (1.6 and earlier used direct PostgreSQL; 1.7 supported both) is a one-way change performed with a migration tool. We suggest dropping all objects if you wish to reuse the tablespace. You can preserve a copy of the old database to query historical information.

Note

This release adds a requirement on the Kombu library and a Redis server. We suggest installing Redis in a container.

Note

This release adds a requirement on prometheus-client. Prometheus is used as an optional monitoring component.

Issues fixed in this release

  • 528: Update license and add copyright notices

  • 207: Under certain circumstances the fetch of the “consumes” information fails but the channel does not go offline operations

  • 547: Update DE client libs to pgsql-12

  • 459: Setuptools issues in decisionengine rpm

  • 546: Request CentOS8 Stream support for Decision Engine

  • 453: Struct Logging Self test errors with pytest-xdist

  • 418: Add auto-formatting of the code

  • 134: Yum update on decisionengine rpm doesn’t restart the service

  • 480: Request: Make postgresql migration script to migrate from old postgresql schema to new sqlalchemy schema

Full list of commits since version 1.7.0

685a3a8e: Added changelog file for developers curated list of changes

044f4463: Updated 1.7 and 2.0 release notes, ready for 2.0.0 RC4

19994fb5: Convert timeout program options to floats.

e2055f92: Address Marco’s review comments.

abdf35ad: Restore multiple queues but purge source queue after each publish.

52936cb5: Improve error-handling.

aad20744: Change to multiprocessing.Lock for protecting channel/source workers.

24bbee41: Adjust launching of source workers in attempt to avoid deadlock.

6d13a392: Remove unnecessary (and perhaps harmful) external updating of channel states.

5456f32f: Improve test coverage.

1afabb70: Use service_actions to disable sources whenever client channels fail.

7f67a172: Various naming and logging adjustments

e6e49184: Adjust de-client --status and add --product-dependencies program option.

a7c1f351: Apply block-while timeout to all channels, not each channel.

3d739ec7: Update ci workflow to include workflow_dispatch mechanism and to customize artifact file name

c5a05650: Archive unit test logs in case of unit test failure and make them available as artifacts

e94c2abb: Update Python 3.6-compatible pre-commit hooks.

aeb6b974: Update Countdown docstrings.

525eb3a8: Add Countdown class to address global timeout problem.

4c458e0c: Updated release notes for 2.0.0 RC3 (1.7.99.post3)

137b574a: Add a minimal container image more suited to production usage

9d7f6875: Provide de-client --queue-status program option.

a7dcc30d: Ensure that channels and sources shared the same queues.

49a316e0: Restore pyupgrade to v2.30, which works on Python 3.6

2ce5ccb6: [pre-commit.ci] pre-commit autoupdate

7bd41851: Print number of pickled bytes of source-produced data.

97aed846: Protect tests from Redis DB/routing-key collisions.

4d3abab7: Flush the Redis DB once the DE server stops.

e36c2150: Remove unnecessary @pytest.mark.usefixtures(…) decorations.

30d68610: More unit testing

7850995d: We should have one path where we test without -v.

a81a52cc: A simple test to ensure the metrics can run

7547720c: Logger tests are a bit unstable at high parallelization.

56516df7: Add missing test to ensure we can change the channel level twice

abde7d0f: Add missing tests for inherited functions

6522ed37: Note lines we are not testing

de7829a4: Remove the unit test log directory if it got created

28fbd599: pin jsonnet 0.17.0

9c5c827e: Metrics seems to want the channels setup to complete

b8829997: Pin pytest version

b348d6f7: Fix deadlock starting cherrypy metrics server

7697e6c1: Log invocation of random port

9e7e4813: Clarify note on xdist, run more workers

0b495fbf: Leave note to remember to cleanup temp files

ca5ddf6f: Ensure we are calling the cherrypy shutdown methods

e60efe78: Move metrics fixtures to the fixtures file

9c717cc5: Log finished with DB init

55965f9e: Prep the server fixture to permit the metrics webserver

732ff99b: Add a ‘ping’ method

6117cc95: [pre-commit.ci] pre-commit autoupdate

b5af73ca: More logging about cherrypy state

dfe4278f: Added unlinked release notes for DE 2.0.0

7d6484ad: Test source shared between two channels.

ae29d9d1: Test same source types, separate channels

6095d33f: Test LatestMessages utility.

dfbf3e06: Separate sources from channels.

2c10391e: Remove source proxy

afcc7cff: Add some more logging to try and trace startup state

dbd49a66: Explicitly pass .coveragerc to pytest.

e6b03216: Set max retry timeout for sqlite in unit tests.

51bed3d6: Updated documentation for 1.7.1 release

3829151f: Allow duplicate keys if their values are the same.

1ea288e0: [pre-commit.ci] pre-commit autoupdate

6b6611e5: Use pre-commit.ci rather than local actions

dedbe4bd: Use local time for structlog timestamp

461c506e: Make sure de_std.libsonnet is provided when packaged.

f93b5963: Update pre-commit hook versions and accommodate python-debian issue.

bba51609: Reduce number of fixtures.

a4510cb1: Segment the update for setuptools so it gets cached correctly

40098f35: Merge pull request #584 from jcpunk/user-pip

4e1b79a1: Merge pull request #583 from jcpunk/drop-dbutils

72c8db4a: Recommend using the site user pip dir instead.

ff604495: Drop unneeded module

b203e2c4: remove extraneous ‘import gc’

ee2278e7: replace needed import

4b7dedf2: add licensing info

e5a56816: add licensing info

a114abba: add licensing info

c2d511cd: adding queue logging to de_logger

77dd8d5a: Also run checks on backports to 1.7

7c029578: Updated developers instructions w/ license maintenance via REUSE information

e66b985d: Fix faulty tests.

d1a86c57: Set Apache 2.0 license and added REUSE compliance

e488030e: Ensure that redis is running.

6c982c11: Report PID for source process.

3f844ca4: Further flesh out the documentation

1d750001: Simplifications and rearrangements.

b85dca45: Set state to error for exceptions caught before the thread start.

c4727acb: Changed summaries to histograms in DecisionEngine and TaskManager modules

6fa0bf4d: Added install document and updated the index and development instructions accordingly

e4de391e: Do the build of the wheel as not-root per our requirements

24ba5272: Add a redis server to the CI testing containers

c939a6ed: Address Pat’s comments.

1925a7b0: First implementation using Kombu/redis to communicate data from sources to cycles.

82faa271: Don’t try to package obsolete sql file

9cbffe94: Drop redundant tests.

ab0de9a5: Drop obsolete raw postgresql interface

164b36d3: Removed unnecessary comment

91f7a76f: Fixed rebase errors

e475fbd5: Added import statement to fix MultiProcessCollector

a409f126: Add no-webserver setting to all DE Test Workers

39cca32e: Moved multiprocess import to metrics to clean up imports.

73762e90: Added --no-webserver to invocations of DEServer

303ee4be: Added __all__ global to control what is exported.

5170224b: Allow for metrics disabling from systemd unit file

2cacef4f: Added check for proper metrics environment and associated unit tests

a637a088: Make webserver operation configurable

b3d6445a: Changed set_to_function calls to set() calls for metrics

5dccc7fa: Changed metric names to match prometheus convention

7371c2e8: Added cherrypy requirement

2c511cea: Added metrics to record time to run Modules and DecisionEngine rpc calls

c24d33bc: Renamed prometheus.py to metrics.py

2335134d: Moved TaskManager metrics to util/prometheus.py to avoid duplicates

3c1b790c: Added metrics endpoint to RPC server, changed prometheus to multiprocess mode, and added CherryPy webserver for prometheus metrics

d8972de0: Added unit tests for metrics API

7b0f641b: Add instructions for running the Redis container.

8d0c4919: Block pytest-postgresql 4

d988f1a0: Lower timeout for actions.

4f920dcc: Simplifications in preparation for Kombu.

eb9f4292: Make TaskManager not executable

00c8f6e6: Remove unused files.

dd990d2c: Adding de-logparser, a tool to help parsing Decision Engine semi-structured logs

1da0d61e: Added a comment to help developers with incomplete installation

1cbc7334: Drop testing/support for PyPy

3ba3e8e6: Ignoring E203, whitespace after ‘:’, since black is adding the whitespace

814669d5: Disable PyPy test that fails for PG_DE_DB_WITH_SCHEMA fixture value.

8d68c287: Fix debug message

33db6425: Test composite workflows using source proxies and configuration combination.

30951a5b: EL7 doesn’t ship with a new enough golang for jsonnetfmt

e72eb3fd: Forbid inheritance from SourceProxy.

e95071fd: Automatically format jsonnet files with jsonnetfmt

355ccd45: Correct tests for python 3.10

64119161: Start testing python 3.10

c1cb8258: add dummy source and test

31b0f30b: Check for duplicate keys after source proxies have been removed.

140a4c47: Fix configuration-combination function signature.

d4a05299: Remove now unnecessary blocking.

1e78a889: Don’t run setup.py as root

a71d5b0a: Add error for running server as root

cd345701: Increase coverage in LogicEngine

6c132924: Fix out of sync devel requirements

8bfab003: Start running tests with xdist

aebe7d49: Remove unnecessary conversion to Pandas dataframe.

28919b16: Allow channels to boot in parallel.

c4fc5997: Improve parameter and variable names.

e021419b: Encourage use of automatic nag hook

cc4e469a: Update hooks to latest via pre-commit autoupgrade

cef30b69: Further simplify some cases

590bea3f: Add channel-combination facilities.

a4a7938c: Various simplifications recommended by flake8-simple

d5157416: Possible simplification to logging.

81e3d1ee: Fix pylint error on create_runner and ProcessingState

2a328c25: Added missing init file to make managers a package

d7f44015: Rework tests for #454

9521d3ce: Add debug statement when default logic-engine configuration is used.

c6dc778c: Unconditionally execute publishers with default configured logic engine.

a48dd7d8: Remove now-unnecessary Python-to-Jsonnet conversion.

a6a81ce7: Run autoformatters

49dac1ec: Setup pre-commit hooks for autoformatters

3800cc2a: Run the code style/standards checks early.

1d42eb0d: TaskManager now inherits from ComponentManager. Also added SourceManager, ChannelManager, and SourceSubscriptionManager files for future integration.

85a16f3b: Python optimised byte code removes assert under some conditions

bed2f5d9: Support latest setuptools_scm release
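
Commits 525eb3a8 and a7c1f351 in the list above mention a Countdown class and a block-while timeout that applies to all channels together rather than to each channel separately. The sketch below shows one way such a shared time budget could work. Apart from the class name Countdown, which appears in the commit message, everything here is hypothetical: the constructor argument, the time_left property, the wait_for_all helper, and the polling strategy are assumptions, not the Decision Engine code.

```python
import time


class Countdown:
    """Hypothetical helper: a single time budget shared by several waits."""

    def __init__(self, wait_up_to, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + wait_up_to

    @property
    def time_left(self):
        """Seconds remaining in the budget, never negative."""
        return max(0.0, self._deadline - self._clock())


def wait_for_all(conditions, timeout, poll_interval=0.01,
                 clock=time.monotonic, sleep=time.sleep):
    """Return True if every condition becomes true within one shared timeout.

    The budget is global: time spent waiting for the first condition
    reduces what is left for the later ones.
    """
    countdown = Countdown(timeout, clock=clock)
    for condition in conditions:
        while not condition():
            if countdown.time_left <= 0:
                return False
            # Never sleep past the shared deadline.
            sleep(min(poll_interval, countdown.time_left))
    return True
```

With a per-channel timeout, N slow channels could block for N times the configured value; with the shared countdown the total wait is bounded by the timeout itself. The clock and sleep parameters exist only so the behaviour can be tested without real delays.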