Skip to content

No pexpect#31

Merged
javierggt merged 13 commits intomasterfrom
no-pexpect-2
Mar 13, 2025
Merged

No pexpect#31
javierggt merged 13 commits intomasterfrom
no-pexpect-2

Conversation

@javierggt
Copy link
Copy Markdown
Contributor

@javierggt javierggt commented Jan 19, 2025

Description

This PR removes uses of pexpect in ska_shell.

It also adds a check argument to several functions. check=True causes an exception if any command returns a non-zero code. check=False runs all commands even if one fails and no exception is raised. getenv does not have the check argument. The current behavior in master varies, but I have seen cases where it does not raise an exception (even though the docstring says it should).

It also does some small unrelated changes:

  • fix the docstring of bash_shell and tcsh_shell
  • ruff. Sorry this might make it hard to evaluate. I can back that commit out if you prefer.
  • renamed a function called test_error which for some reason triggered a failure when running testr. I don't know why this did not fail before and failed now, but it does not hurt to just change the name.

Notes

A small complicating issue was that the tcsh call was always loading .cshrc, and that was overwriting the PATH given as input. To sidestep this, the call to tcsh is wrapped in a call to bash, and then bash calls tcsh with the -f option.

This PR was motivated by warnings appearing in tests that were caused by pexpect (mica and agasc).

An alternative is #30, where I only removed uses of pexpect in the getenv function.

Interface impacts

  • added check argument to run_shell, bash, bash_shell, tcsh, tcsh_shell
  • added a feature to pass a logger to run_shell. The logger is called continuously, so you get a log as the program progresses. This is useful for long-running processes. I used this in runasp.

Testing

Unit tests

  • Linux
(ska3-flight-2025.1rc1) jgonzale ska_shell $ pytest ska_shell
========================================================= test session starts =========================================================
platform linux -- Python 3.12.8, pytest-8.3.4, pluggy-1.5.0
rootdir: /proj/sot/ska/jgonzalez/git
configfile: pytest.ini
plugins: anyio-4.7.0, dash-2.18.2, timeout-2.3.1
collected 25 items                                                                                                                    

ska_shell/tests/test_shell.py .........................                                                                         [100%]

========================================================= 25 passed in 3.35s ==========================================================
(ska3-flight-2025.1rc1) jgonzale ska_shell $ git rev-parse HEAD
3215cf2bcf19004be0033e5b7477eeba614c8da4

Independent check of unit tests by Jean

  • Linux
ska3-jeanconn-fido> git rev-parse HEAD
76dd49b26da847b59e5efe17daaf42e010fef2fb
ska3-jeanconn-fido> pytest
============================= test session starts ==============================
platform linux -- Python 3.12.8, pytest-8.3.4, pluggy-1.5.0
rootdir: /proj/sot/ska/jeanproj/git
configfile: pytest.ini
plugins: anyio-4.7.0, timeout-2.3.1
collected 25 items                                                                                                                 

ska_shell/tests/test_shell.py .........................                                                                      [100%]

======================================================== 25 passed in 3.93s

Functional tests

import logging
from ska_shell import run_shell, getenv

logging.basicConfig(level="INFO")
logger = logging.getLogger("name")

cmds = """python -c 'import time; time.sleep(2); print("here")'
python -c 'import time; time.sleep(2); print("here")'
python -c 'import time; time.sleep(2); print("here")'
"""

run_shell(cmds, logger=logger)

After running the tests linked in the description, I installed this branch into that environment and ran

run_testr --include mica --include agasc

and the warning in agasc went away (not the one in kadi).

JC ran the aimpoint code that calls dmcoords and it did not fail

In [1]: run aimpoint_mon/update_observed_aimpoints.py --data-root test
2025-03-05 08:28:16,503 Getting 30808 ACIS-S from archive
2025-03-05 08:28:19,375 Obsid=30808 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.3 dy=-0.9 dr=2.4
2025-03-05 08:28:19,383 Getting 30805 ACIS-I from archive
2025-03-05 08:28:21,602 Obsid=30805 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.4 dy=3.4 dr=4.1
2025-03-05 08:28:21,609 Getting 30374 ACIS-S from archive
2025-03-05 08:28:23,809 Obsid=30374 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-4.3 dy=-1.2 dr=4.4
2025-03-05 08:28:23,817 Getting 28612 ACIS-I from archive
2025-03-05 08:28:25,978 Obsid=28612 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.8 dy=3.3 dr=3.8
2025-03-05 08:28:25,985 Getting 30809 ACIS-S from archive
2025-03-05 08:28:28,226 Obsid=30809 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-3.0 dy=-2.1 dr=3.7
2025-03-05 08:28:28,233 Getting 28630 ACIS-I from archive
2025-03-05 08:28:30,398 Obsid=28630 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.7 dy=2.8 dr=3.3
2025-03-05 08:28:30,406 Getting 30376 ACIS-S from archive
2025-03-05 08:28:32,612 Obsid=30376 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.7 dy=-1.4 dr=3.1
2025-03-05 08:28:32,620 Getting 30801 ACIS-I from archive
2025-03-05 08:28:34,809 Obsid=30801 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.0 dy=2.3 dr=2.5
2025-03-05 08:28:34,818 Getting 28532 ACIS-S from archive
2025-03-05 08:28:37,124 Obsid=28532 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.6 dy=-1.7 dr=3.1
2025-03-05 08:28:37,132 Getting 30803 ACIS-I from archive
2025-03-05 08:28:39,334 Obsid=30803 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.9 dy=2.4 dr=3.1
2025-03-05 08:28:39,342 Getting 28099 ACIS-S from archive
2025-03-05 08:28:41,852 Obsid=28099 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-3.6 dy=-2.6 dr=4.5
2025-03-05 08:28:41,859 Getting 30799 ACIS-S from archive
2025-03-05 08:28:44,386 Obsid=30799 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.7 dy=-1.9 dr=2.6
2025-03-05 08:28:44,395 Getting 30804 ACIS-I from archive
2025-03-05 08:28:46,599 Obsid=30804 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.4 dy=3.8 dr=4.5
2025-03-05 08:28:46,606 Getting 30810 ACIS-S from archive
2025-03-05 08:28:48,851 Obsid=30810 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-4.6 dy=-1.0 dr=4.7
2025-03-05 08:28:48,859 Getting 30802 ACIS-I from archive
2025-03-05 08:28:51,033 Obsid=30802 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-3.1 dy=3.9 dr=5.0
2025-03-05 08:28:51,040 Getting 29771 ACIS-S from archive
2025-03-05 08:28:53,194 Obsid=29771 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-7.5 dy=-9.3 dr=11.9
2025-03-05 08:28:53,203 Getting 28582 ACIS-S from archive
2025-03-05 08:28:55,697 Obsid=28582 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-3.7 dy=-3.1 dr=4.8
2025-03-05 08:28:55,707 Getting 29831 ACIS-S from archive
2025-03-05 08:28:58,010 Obsid=29831 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.2 dy=-2.7 dr=3.5
2025-03-05 08:28:58,018 Getting 30800 ACIS-S from archive
2025-03-05 08:29:00,279 Obsid=30800 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.6 dy=-2.7 dr=3.1
2025-03-05 08:29:00,286 Getting 29740 ACIS-I from archive
2025-03-05 08:29:02,450 Obsid=29740 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.2 dy=1.5 dr=2.7
2025-03-05 08:29:02,458 Getting 30551 HRC-S from archive
2025-03-05 08:29:06,687 Obsid=30551 detector=HRC-S  chipx=2195.0 chipy=8915.0 dx=-1.3 dy=4.8 dr=5.0
2025-03-05 08:29:06,695 Getting 28560 ACIS-S from archive
2025-03-05 08:29:08,940 Obsid=28560 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-4.8 dy=-2.9 dr=5.6
2025-03-05 08:29:08,948 Getting 28132 ACIS-S from archive
2025-03-05 08:29:11,625 Obsid=28132 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-4.1 dy=-2.1 dr=4.6
2025-03-05 08:29:11,633 Getting 30587 ACIS-I from archive
2025-03-05 08:29:14,375 Obsid=30587 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-0.7 dy=2.1 dr=2.2
2025-03-05 08:29:14,382 Getting 30239 ACIS-I from archive
2025-03-05 08:29:16,842 Obsid=30239 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.0 dy=2.1 dr=2.9
2025-03-05 08:29:16,852 Getting 30823 ACIS-I from archive
2025-03-05 08:29:19,115 Obsid=30823 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.3 dy=2.3 dr=2.7
2025-03-05 08:29:19,123 Getting 30820 ACIS-I from archive
2025-03-05 08:29:21,669 Obsid=30820 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.8 dy=2.8 dr=3.3
2025-03-05 08:29:21,676 Getting 30278 ACIS-I from archive
2025-03-05 08:29:25,402 Obsid=30278 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.4 dy=2.7 dr=3.6
2025-03-05 08:29:25,409 Getting 30553 HRC-S from archive
2025-03-05 08:29:29,481 Obsid=30553 detector=HRC-S  chipx=2195.0 chipy=8915.0 dx=-0.2 dy=1.3 dr=1.3
2025-03-05 08:29:29,488 Getting 26975 ACIS-I from archive
2025-03-05 08:29:31,784 Obsid=26975 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.0 dy=-0.3 dr=2.0
2025-03-05 08:29:31,791 Getting 30821 ACIS-I from archive
2025-03-05 08:29:33,962 Obsid=30821 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.8 dy=-0.9 dr=2.0
2025-03-05 08:29:33,970 Getting 30032 ACIS-S from archive
2025-03-05 08:29:36,272 Obsid=30032 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-0.6 dy=-1.3 dr=1.4
2025-03-05 08:29:36,280 Getting 30555 HRC-S from archive
2025-03-05 08:29:40,716 Obsid=30555 detector=HRC-S  chipx=2195.0 chipy=8915.0 dx=-0.4 dy=0.7 dr=0.8
2025-03-05 08:29:40,724 Getting 30284 ACIS-I from archive
2025-03-05 08:29:43,874 Obsid=30284 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-0.7 dy=0.6 dr=0.9
2025-03-05 08:29:43,881 Getting 29646 ACIS-S from archive
2025-03-05 08:29:46,133 Obsid=29646 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.6 dy=-1.0 dr=1.9
2025-03-05 08:29:46,141 Getting 29965 ACIS-S from archive
2025-03-05 08:29:48,253 Obsid=29965 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-3.1 dy=0.1 dr=3.1
2025-03-05 08:29:48,260 Getting 29884 ACIS-S from archive
2025-03-05 08:29:51,001 Obsid=29884 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.2 dy=-1.3 dr=1.8
2025-03-05 08:29:51,009 Getting 30824 ACIS-I from archive
2025-03-05 08:29:53,167 Obsid=30824 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.3 dy=1.5 dr=2.8
2025-03-05 08:29:53,175 Getting 30556 HRC-S from archive
2025-03-05 08:29:58,002 Obsid=30556 detector=HRC-S  chipx=2195.0 chipy=8915.0 dx=-0.3 dy=2.7 dr=2.7
2025-03-05 08:29:58,009 Getting 30813 ACIS-S from archive
2025-03-05 08:30:00,282 Obsid=30813 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.6 dy=-2.8 dr=3.2
2025-03-05 08:30:00,292 Getting 29838 ACIS-I from archive
2025-03-05 08:30:02,699 Obsid=29838 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.1 dy=2.4 dr=3.2
2025-03-05 08:30:02,707 Getting 30814 ACIS-S from archive
2025-03-05 08:30:05,098 Obsid=30814 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.8 dy=-1.4 dr=3.2
2025-03-05 08:30:05,104 Getting 30285 ACIS-I from archive
2025-03-05 08:30:08,448 Obsid=30285 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.6 dy=3.3 dr=4.2
2025-03-05 08:30:08,456 Getting 30822 HRC-S from archive
2025-03-05 08:30:12,110 Obsid=30822 detector=HRC-S  chipx=2195.0 chipy=8915.0 dx=-0.7 dy=2.7 dr=2.8
2025-03-05 08:30:12,118 Getting 29647 ACIS-S from archive
2025-03-05 08:30:14,416 Obsid=29647 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.3 dy=-0.8 dr=1.5
2025-03-05 08:30:14,424 Getting 30819 ACIS-S from archive
2025-03-05 08:30:16,837 Obsid=30819 detector=ACIS-S chipx=193.7 chipy=520.0 dx=0.0 dy=-1.1 dr=1.1
2025-03-05 08:30:16,845 Getting 29704 ACIS-I from archive
2025-03-05 08:30:19,139 Obsid=29704 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-4.9 dy=0.5 dr=4.9
2025-03-05 08:30:19,155 Getting 30812 ACIS-S from archive
2025-03-05 08:30:21,668 Obsid=30812 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.4 dy=-1.6 dr=2.9
2025-03-05 08:30:21,676 Getting 29705 ACIS-I from archive
2025-03-05 08:30:23,954 Obsid=29705 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-0.0 dy=3.3 dr=3.3
2025-03-05 08:30:23,961 Getting 29706 ACIS-I from archive
2025-03-05 08:30:26,041 Obsid=29706 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-2.9 dy=2.3 dr=3.7
2025-03-05 08:30:26,048 Getting 30815 ACIS-S from archive
2025-03-05 08:30:28,282 Obsid=30815 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.5 dy=-1.9 dr=3.1
2025-03-05 08:30:28,289 Getting 30817 ACIS-S from archive
2025-03-05 08:30:30,529 Obsid=30817 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.5 dy=-2.6 dr=3.0
2025-03-05 08:30:30,537 Getting 29707 ACIS-I from archive
2025-03-05 08:30:32,638 Obsid=29707 detector=ACIS-I chipx=953.7 chipy=958.7 dx=0.4 dy=2.0 dr=2.0
2025-03-05 08:30:32,645 Getting 30818 ACIS-S from archive
2025-03-05 08:30:34,891 Obsid=30818 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.5 dy=-0.7 dr=1.7
2025-03-05 08:30:34,898 Getting 29708 ACIS-I from archive
2025-03-05 08:30:37,032 Obsid=29708 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-3.3 dy=0.7 dr=3.3
2025-03-05 08:30:37,039 Getting 30816 ACIS-S from archive
2025-03-05 08:30:39,375 Obsid=30816 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.2 dy=-1.3 dr=2.6
2025-03-05 08:30:39,387 Getting 28342 HRC-I from archive
2025-03-05 08:30:42,752 Obsid=28342 detector=HRC-I  chipx=7590.0 chipy=7745.0 dx=1.7 dy=3.4 dr=3.8
2025-03-05 08:30:42,760 Getting 29709 ACIS-I from archive
2025-03-05 08:30:44,890 Obsid=29709 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-5.0 dy=2.4 dr=5.5
2025-03-05 08:30:44,898 Getting 28562 ACIS-S from archive
2025-03-05 08:30:47,470 Obsid=28562 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.1 dy=-1.1 dr=2.4
2025-03-05 08:30:47,480 Getting 28152 ACIS-S from archive
2025-03-05 08:30:49,639 Obsid=28152 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.9 dy=-1.2 dr=2.2
2025-03-05 08:30:49,646 Getting 30830 HRC-I from archive
2025-03-05 08:30:52,745 Obsid=30830 detector=HRC-I  chipx=7590.0 chipy=7745.0 dx=1.1 dy=4.1 dr=4.2
2025-03-05 08:30:52,752 Getting 29710 ACIS-I from archive
2025-03-05 08:30:54,841 Obsid=29710 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-1.9 dy=1.5 dr=2.5
2025-03-05 08:30:54,848 Getting 30831 ACIS-S from archive
2025-03-05 08:30:57,067 Obsid=30831 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.4 dy=-1.4 dr=2.8
2025-03-05 08:30:57,076 Getting 30827 ACIS-S from archive
2025-03-05 08:30:59,610 Obsid=30827 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-1.9 dy=-2.1 dr=2.8
2025-03-05 08:30:59,622 Getting 29711 ACIS-I from archive
2025-03-05 08:31:02,172 Obsid=29711 detector=ACIS-I chipx=953.7 chipy=958.7 dx=-3.5 dy=2.1 dr=4.1
2025-03-05 08:31:02,180 Getting 30832 ACIS-S from archive
2025-03-05 08:31:04,695 Obsid=30832 detector=ACIS-S chipx=193.7 chipy=520.0 dx=-2.8 dy=-1.9 dr=3.4
2025-03-05 08:31:04,698 Writing test/observed_aimpoints.dat
2025-03-05 08:31:06,877 Writing plot files test/observed_aimpoints.png
2025-03-05 08:31:11,161 Writing plot files test/observed_aimpoints.pdf

@javierggt javierggt changed the base branch from master to no-pexpect January 19, 2025 19:11
@javierggt javierggt changed the base branch from no-pexpect to master January 20, 2025 14:49
@javierggt javierggt changed the title No pexpect 2 No pexpect Jan 20, 2025
@jeanconn jeanconn requested a review from Copilot January 28, 2025 17:56
@jeanconn
Copy link
Copy Markdown
Contributor

As a drop-in replacement for ska_shell, this didn't work with my local copy of runasp.py

2025-01-28 15:30:11,289 Tcsh-2025-01-28T15:30:11.28>
Traceback (most recent call last):
  File "/proj/sot/ska/jeanproj/git/runasp_wd/obs30739/../runasp.py", line 704, in <module>
    main(opt)
  File "/proj/sot/ska/jeanproj/git/runasp_wd/obs30739/../runasp.py", line 689, in main
    run_ai(range_ais)
  File "/proj/sot/ska/jeanproj/git/runasp_wd/obs30739/../runasp.py", line 399, in run_ai
    loglines = tcsh_shell("punlearn asp_l1_std",
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fido.real/miniforge3/envs/test/lib/python3.12/site-packages/Ska/Shell/shell.py", line 238, in tcsh_shell
    outlines, newenv = run_shell(
                       ^^^^^^^^^^
  File "/fido.real/miniforge3/envs/test/lib/python3.12/site-packages/Ska/Shell/shell.py", line 153, in run_shell
    raise exc

So I think we'd need to finish sot/runasp#8 or just add some test cases that are doing the things that are in the current runasp.py.

@javierggt
Copy link
Copy Markdown
Contributor Author

I think we should make this work with runasp.

It's not clear to me from the error what is the problem.

@jeanconn
Copy link
Copy Markdown
Contributor

It looked to me like it just didn't run the command with the PATH from the ascds_env to find punlearn. But I think your changes for runasp made sense anyway with regard to improved logging too, so maybe a compromise PR in runasp?

@javierggt
Copy link
Copy Markdown
Contributor Author

it seems like the exception is raised because the command returned a non-zero code. Note that the original docstring says:

If exit status is non-zero at any point then processing is terminated and a ShellError exception is raise.

So that would be the original intended behavior anyway. The question is: is the punlearn asp_l1_std command really returning non-zero? If not, then what is?

@javierggt
Copy link
Copy Markdown
Contributor Author

javierggt commented Jan 28, 2025

The current changes in this PR can be used instead of that PR in runasp (that was my intention). The functional test in this PR is the way I would use it in runasp.

Still, I did not expect the runasp master to fail. It shouldn't, unless there is some issue in master that I inadvertently fixed.

@jeanconn
Copy link
Copy Markdown
Contributor

It looks like my error trace copy-paste didn't have the Command not founds

2025-01-28 16:21:37,022 punlearn: Command not found.
2025-01-28 16:21:37,022 Tcsh-2025-01-28T16:21:37.02>
Traceback (most recent call last):
  File "/proj/sot/ska/jeanproj/git/runasp_re/obs30739/../runasp.py", line 695, in <module>
    main(opt)
  File "/proj/sot/ska/jeanproj/git/runasp_re/obs30739/../runasp.py", line 680, in main
    run_ai(range_ais)
  File "/proj/sot/ska/jeanproj/git/runasp_re/obs30739/../runasp.py", line 396, in run_ai
    loglines = tcsh_shell("punlearn asp_l1_std",
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fido.real/miniforge3/envs/test/lib/python3.12/site-packages/Ska/Shell/shell.py", line 238, in tcsh_shell
    outlines, newenv = run_shell(
                       ^^^^^^^^^^
  File "/fido.real/miniforge3/envs/test/lib/python3.12/site-packages/Ska/Shell/shell.py", line 153, in run_shell
    raise exc
Ska.Shell.shell.ShellError: Shell error: punlearn: Command not found.

This is straight from master.

@jeanconn
Copy link
Copy Markdown
Contributor

ascds_env = ska_shell.getenv('source /home/ascds/.ascrc -r release', shell='tcsh')

In [4]: ascds_env["PATH"]
Out[4]: '/home/ascds/DS.release/bin:/soft/SYBASE16.0/OCS-16_0/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/usr/bin/X11:/usr/X11R6/bin:/soft/tex/bin'

In [5]: ska_shell.tcsh_shell("plist asp_l1_std", env=ascds_env)
---------------------------------------------------------------------------
ShellError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 ska_shell.tcsh_shell("plist asp_l1_std", env=ascds_env)

File ~/git/ska_shell/ska_shell/shell.py:238, in tcsh_shell(cmdstr, logfile, importenv, getenv, env, logger, log_level)
    221 def tcsh_shell(cmdstr, logfile=None, importenv=False, getenv=False, env=None, logger=None, log_level=None):
    222     """
    223     Run the command string ``cmdstr`` in a tcsh shell.  It can have
    224     multiple lines. If exit status is non-zero at any point
   (...)
    236     :rtype: (outlines, deltaenv)
    237     """
--> 238     outlines, newenv = run_shell(
    239         cmdstr,
    240         shell='tcsh',
    241         logfile=logfile,
    242         importenv=importenv,
    243         getenv=getenv,
    244          env=env,
    245         logger=logger,
    246         log_level=log_level
    247     )
    248     return outlines, newenv

File ~/git/ska_shell/ska_shell/shell.py:153, in run_shell(cmdstr, shell, logfile, importenv, getenv, env, logger, log_level)
    151     exc = ShellError(f"Shell error: {msg}")
    152     exc.lines = stdout
--> 153     raise exc
    155 newenv = {}
    156 if "__PRINTENV__" in stdout:

ShellError: Shell error: plist: Command not found.

@jeanconn
Copy link
Copy Markdown
Contributor

In [7]: ls -l /home/ascds/DS.release/bin/plist
-rwxr-xr-x 1 cm ccase 16112 Dec  9 16:14 /home/ascds/DS.release/bin/plist*

Copy link
Copy Markdown
Contributor

@jeanconn jeanconn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted, this didn't work in my limited testing.

@jeanconn
Copy link
Copy Markdown
Contributor

It looks like for me a minimal working reproducible example of the runasp test goes something like:

  1. Make a test environment on one of our linux machines and pip install the test ska_shell into it (PYTHONPATH doesn't work for the old code still using Ska.Shell)
  2. Source into the new environment (needs "production" environment to run arc5gl for runasp)
  3. Make a clean clone of runasp and run a test obsid (this is the one that failed for me though I assume all fail the same).
cd /tmp
git clone git@github.com:sot/runasp
cd runasp
mkdir obs30739
cd obs30739
python ../runasp.py --obsid 30739

And that fails with

2025-02-19 14:54:05,815 punlearn: Command not found.
2025-02-19 14:54:05,816 Tcsh-2025-02-19T14:54:05.81>
Traceback (most recent call last):
  File "/tmp/runasp/obs30739/../runasp.py", line 695, in <module>
    main(opt)
  File "/tmp/runasp/obs30739/../runasp.py", line 680, in main
    run_ai(range_ais)
  File "/tmp/runasp/obs30739/../runasp.py", line 396, in run_ai
    loglines = tcsh_shell("punlearn asp_l1_std",
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fido.real/miniforge3/envs/test/lib/python3.12/site-packages/Ska/Shell/shell.py", line 238, in tcsh_shell
    outlines, newenv = run_shell(
                       ^^^^^^^^^^
  File "/fido.real/miniforge3/envs/test/lib/python3.12/site-packages/Ska/Shell/shell.py", line 153, in run_shell
    raise exc
Ska.Shell.shell.ShellError: Shell error: punlearn: Command not found.

@javierggt
Copy link
Copy Markdown
Contributor Author

I cannot reproduce your error. I get a different failure, and it does not look like a shell issue.

I followed your instructions and created an environment in /export/jgonzale/miniconda3/envs/ska-shell which has the no-pexpect-2 branch of ska_shell installed. Then I did

rm -fr /proj/sot/ska/jgonzalez/git/runasp/obs30739
mkdir /proj/sot/ska/jgonzalez/git/runasp/obs30739
cd /proj/sot/ska/jgonzalez/git/runasp/obs30739
python ../runasp.py --obsid 30739

I can clearly see the punlearn command succeed multiple times. The line you mention is the very first command in run_ai and that succeeds for me. That kind of error points to a wrong environment, and should be reproducible with a smaller script.

In my case, it fails with:

2025-02-21 11:42:37,132 flt_run_pipe: flt_pctr -prof pipeline_out/ASP_L1_STD_853724356/out1/f853724356.ped -startat make_obc_solution -stopat check_star_data -showlog
2025-02-21 11:42:37,175 2025_052_16:42:37	Successfully started make_obc_solution
2025-02-21 11:42:39,176 2025_052_16:42:39	make_obc_solution completed successfully.
2025-02-21 11:42:39,177 2025_052_16:42:39	Successfully started get_star_data
2025-02-21 11:42:40,177 2025_052_16:42:40	get_star_data completed successfully.
2025-02-21 11:42:40,177 2025_052_16:42:40	Processing was requested to stop before check_star_data
2025-02-21 11:42:40,177 
2025-02-21 11:42:40,177 End Profile Execution
2025-02-21 11:42:40,178 
2025-02-21 11:42:40,178 Task-Specific Log Information:
2025-02-21 11:42:40,178 
2025-02-21 11:42:40,178 
2025-02-21 11:42:42,180 End Execution
2025-02-21 11:42:42,180 
2025-02-21 11:42:42,180 Parameter Settings:
2025-02-21 11:42:42,180 
2025-02-21 11:42:44,191 End Execution
2025-02-21 11:42:44,191 
2025-02-21 11:42:44,191 Parameter Settings:
2025-02-21 11:42:44,191 
2025-02-21 11:42:44,201 flt_run_pipe: ################################
2025-02-21 11:42:44,201 flt_run_pipe: Fatal ERROR - pctr Error = 80
2025-02-21 11:42:44,201 flt_run_pipe: ################################

It succeeds with ska_shell master, so whatever the issue is, it is introduced by this branch, but it is not obvious to me.

@jeanconn
Copy link
Copy Markdown
Contributor

Does your no-pexpect-2 branch match github's? This is my git rev

jeanconn-kady> git rev-parse --short HEAD
7f747b7

and the code in /export/jgonzale/miniconda3/envs/ska-shell/lib/python3.12/site-packages/ska_shell/shell.py also has diffs relative to the shell.py I've got.

@javierggt
Copy link
Copy Markdown
Contributor Author

javierggt commented Feb 21, 2025

Yes, it matches:

(ska-shell) jgonzale ska_shell $ git rev-parse --short HEAD
7f747b7

I ran the pipeline using ska_shell master and it succeeds. However, I do see this in the output:

2025-02-21 11:48:19,461 flt_run_pipe: ################################
flt_run_pipe: Fatal ERROR - pctr Error = 80
flt_run_pipe: ################################

This means flt_run_pipe returned the exit code 80 and the pipeline ignored it and continued running. I would argue that this is not the right behavior, and that the no-pexpect-2 branch does it right. A non-zero return code should fail.

I ran it ignoring the return code (this needs changes in ska_shell and runasp), and it ran to the end, but I don't know what failed and what succeeded.

@javierggt
Copy link
Copy Markdown
Contributor Author

javierggt commented Feb 21, 2025

I just pushed a change that adds an exception type that is used when raising exception if the return code is non-zero. Then in runasp I added try-except blocks to ignore it. Doing this, it ran to the end (but as I said, I would argue this is not correct).

@javierggt
Copy link
Copy Markdown
Contributor Author

@javierggt
Copy link
Copy Markdown
Contributor Author

javierggt commented Feb 21, 2025

The source of the problem is that ~/.cshrc is being sourced in the subprocess call, and PATH is overwritten as a result. I checked that .bashrc is not sourced, so I wrapped the tcsh calls in bash calls. Not pretty, but seems to work.

@javierggt
Copy link
Copy Markdown
Contributor Author

but of course this will fail if the given command has single quotes.

@javierggt
Copy link
Copy Markdown
Contributor Author

@jeanconn

After determining that the issue was the .cshrc and making the last commit, I could test different versions and reproduce your error.

Using this branch is it is currently, that error does not occur anymore. I did some tests with commands using single quotes and could not get it to fail, so I think I prefer this PR over the ska_helpers one.

The pipeline still fails with this error:

2025-02-25 12:03:06,638 flt_run_pipe: ################################
2025-02-25 12:03:06,638 flt_run_pipe: Fatal ERROR - pctr Error = 80
2025-02-25 12:03:06,638 flt_run_pipe: ################################
2025-02-25 12:03:06,639 Bash-2025-02-25T12:03:06.63>
Traceback (most recent call last):
  File "/proj/sot/ska/jgonzalez/git/runasp/obs30739/../runasp.py", line 695, in <module>
    main(opt)
  File "/proj/sot/ska/jgonzalez/git/runasp/obs30739/../runasp.py", line 680, in main
    run_ai(range_ais)
  File "/proj/sot/ska/jgonzalez/git/runasp/obs30739/../runasp.py", line 439, in run_ai
    tcsh_shell(first_pipe,
  File "/export/jgonzale/miniconda3/envs/ska-shell/lib/python3.12/site-packages/Ska/Shell/shell.py", line 247, in tcsh_shell
    outlines, newenv = run_shell(
                       ^^^^^^^^^^
  File "/export/jgonzale/miniconda3/envs/ska-shell/lib/python3.12/site-packages/Ska/Shell/shell.py", line 162, in run_shell
    raise exc
Ska.Shell.shell.NonZeroReturnCode: Shell error: flt_run_pipe: ################################

I checked that this error also happens when using the latest ska_shell release, but ska_shell does not bail out. I contend that this is an actual error, and ska_shell should fail in this case. One could add a check argument, like the subprocess functions do, but I would still make the default True.

@javierggt javierggt requested a review from jeanconn March 3, 2025 14:40
@jeanconn
Copy link
Copy Markdown
Contributor

jeanconn commented Mar 5, 2025

Hi @javierggt - would you please update the Description to document that you ran the unit tests?

@jeanconn
Copy link
Copy Markdown
Contributor

Also, for my use case in runasp it looks like I want to continue on error status 80. How can I get the status code of the last error from this code to make that decision?

@javierggt
Copy link
Copy Markdown
Contributor Author

javierggt commented Mar 11, 2025

How can I get the status code of the last error from this code to make that decision?

I didn't want to make that an option because it's not a good behavior. You could just run with check=False and that should give you the same result.

@jeanconn
Copy link
Copy Markdown
Contributor

Yes, but I have no control over that behavior unless I doubly-wrap the pipe manager. So how would I access the error code?

@javierggt
Copy link
Copy Markdown
Contributor Author

javierggt commented Mar 11, 2025

Right now you are not checking the return code behavior, but that runasp gives the same result as a pipeline that ran "successfully". Maybe you could use the branch in sot/runasp/pull/9?

I think this is something that should be fixed in runasp one way or another.

@jeanconn
Copy link
Copy Markdown
Contributor

Right, I mean I understand that previously nothing was being checked, but now that I've looked at the situations that return non-zero it seems like we could do better than that. Anyway, I simultaneously edited runasp to check most calls except ones that are going to end in code 80 and it is working for me.

@javierggt
Copy link
Copy Markdown
Contributor Author

I figured that just adding the check argument was "doing better". Strictly speaking, you do not need to know the last return code, and the current ska_shell interface does not permit that anyway. Using check=True, the shell command will raise an exception. What else would you like it to do?

@javierggt
Copy link
Copy Markdown
Contributor Author

One thing one could add is that the exception that is raised includes the error code. Right now it is just included in the message that says "Shell command %s failed with exit status %d".

@javierggt
Copy link
Copy Markdown
Contributor Author

sorry, my mistake. In this branch, the error code is not included in the message, but I could include it, and the return code could be a member of ShellError

@jeanconn
Copy link
Copy Markdown
Contributor

jeanconn commented Mar 11, 2025

I agree that adding the "check" is doing better. But I don't really agree with "Strictly speaking, you do not need to know the last return code," . I know that return codes are all over the place and generally you just need to stop on a non-zero value. But if I've identified a specific case where one value seems to be benign for a certain constructed cmd , I don't really see a way to handle that correctly without knowing the value (except to add a another shell script to wrap flt_run_pipe and conditionally mangle the exit status).

But you'll see in my PR on runasp that I just set check conditionally to not check for the case that is presently failing but let it check for all the rest.

@jeanconn
Copy link
Copy Markdown
Contributor

Regarding "but I could include it, and the return code could be a member of ShellError" that would be awesome! And you see from email traffic I'd also like to get the upstream flt_run_pipe code fixed if it is fixable.

@javierggt javierggt merged commit e63b8f0 into master Mar 13, 2025
@javierggt javierggt deleted the no-pexpect-2 branch March 13, 2025 15:59
@javierggt javierggt mentioned this pull request Jun 9, 2025
5 tasks
@javierggt javierggt review requested due to automatic review settings March 23, 2026 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants