Date: 18-09-20  Time: 10:44 AM

Author Topic: Python 3.8, /dev/urandom, build failures  (Read 2701 times)


jmealins

  • New Member
  • *
  • Posts: 4
  • Karma: +0/-0
Python 3.8, /dev/urandom, build failures
« on: February 04, 2020, 12:29:05 AM »
Hello! Hoping to get your attention as you were referenced in a python core issue :)
https://bugs.python.org/issue36843

I recently tried my hand at building Python 3.8 on an AIX 7.1 VM running on a POWER9 LPAR. I keep running into the exact same problem that is referenced in that Python core issue: I can't seem to open /dev/urandom and read from it. I have checked the permissions and I have deleted the nodes and recreated them (randomctl -u/rm -rf/reboot), but the build always fails.

I can't believe that reading from /dev/urandom is broken, and it really smells like an environmental issue (the .c file referenced in that issue, built with both xlc 13.1.6 and xlc 16.1.0.4, fails to open and read /dev/urandom). Clearly you didn't run into this issue when building, so might I ask if you made any configuration changes or patches in your aixtools build? At first I thought I was doing something wrong with xlclang.

I had run into issues like this in the past with LPARs and openssh, but this one has me flummoxed.

Thanks!

Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #1 on: February 05, 2020, 05:20:07 PM »
I'll have to take a closer look.
Something simple to verify:

aixtools@x064:[/home/aixtools]ls -l /dev/*random
crw-r--r--    1 root     system       34,  0 Jan 24 12:58 /dev/random
crw-r--r--    1 root     system       34,  1 Jan 24 12:58 /dev/urandom



Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #2 on: February 05, 2020, 06:20:50 PM »
As to how I build.
I use xlc and define CC as CC=xlc_r
Other than that, I have some include files added to /usr/include (to be able to link against some libraries provided by rpm.rte) and I add the argument "--without-computed-gotos"
With all the "extra" settings included, it looks like:
./configure --prefix=/opt --sysconfdir=/var/python3/etc --sharedstatedir=/var/python3/com --localstatedir=/var/python3 --mandir=/usr/share/man --infodir=/opt/share/info/python3 --without-computed-gotos

As to testing the readability of /dev/urandom (or /dev/random) - a simple action is to use dd and pipe that to od
root@x065:[/data/prj/python/python3-3.9]dd if=/dev/random bs=8 count=1 | od -db
0000000    21283   25110   12861   13089
         123 043 142 026 062 075 063 041
1+0 records in.
1+0 records out.
0000010
root@x065:[/data/prj/python/python3-3.9]dd if=/dev/urandom bs=8 count=1 | od -db
0000000    30698   39438   59032   02624
         167 352 232 016 346 230 012 100
1+0 records in.
1+0 records out.
0000010

Also, would like to know the output of:
oslevel -s; nohup oslevel -s -q | head -2; lslpp -Lcq bos.mp64
Michael

Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #3 on: February 05, 2020, 07:17:02 PM »
p.s. I downloaded the urandom.c sample, compiled and ran it (non-root)

aixtools@x064:[/data/prj/aixtools/tests]make urandom
        cc -O  urandom.c -o urandom
aixtools@x064:[/data/prj/aixtools/tests]./urandom
open O_RDONLY succeeded
read(16) -> 16
open O_RDONLY | O_CLOEXEC succeeded
read(16) -> 16
aixtools@x064:[/data/prj/aixtools/tests]id
uid=518(aixtools) gid=1955(aixtools) groups=1(staff),1954(felt)



AIX 7.1 TL4 SP6

jmealins

  • New Member
  • *
  • Posts: 4
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #4 on: February 05, 2020, 07:49:32 PM »
Thanks for responding! I ran almost everything that you ran as well; the output is in the following code block.

I wonder if my service level is too far behind.
Code: [Select]
bash-5.0$ ls -l /dev/*random
crw-r--r--    1 root     system       35,  0 Feb 05 09:58 /dev/random
crw-r--r--    1 root     system       35,  1 Feb 05 09:58 /dev/urandom
bash-5.0$ dd if=/dev/random bs=8 count=1 | od -db
0000000    36372   53208   01487   34366
         216 024 317 330 005 317 206 076
0000010
1+0 records in.
1+0 records out.
bash-5.0$ dd if=/dev/urandom bs=8 count=1 | od -db
0000000    64220   30892   38531   24531
         372 334 170 254 226 203 137 323
1+0 records in.
1+0 records out.
0000010
bash-5.0$ oslevel -s; nohup oslevel -s -q | head -2; lslpp -Lcq bos.mp64
7100-04-00-0000
Known Service Packs
-------------------
bos.mp64:bos.mp64:7.1.4.0: : :C: :Base Operating System 64-bit Multiprocessor Runtime : : : : : : :1:0:/:1543
bash-5.0$ xlc_r -O urandom.c -o urandom
bash-5.0$ ./urandom
open O_RDONLY failed
open O_RDONLY | O_CLOEXEC failed
bash-5.0$ id
uid=204(jmealins) gid=1(staff)
bash-5.0$ xlc_r
/opt/IBM/xlC/13.1.3/bin/.orig/xlc_r: 1501-294 (S) No input file specified. Please use -qhelp for more information.

jmealins

  • New Member
  • *
  • Posts: 4
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #5 on: February 05, 2020, 07:55:28 PM »
Running with your exact configure command, it fails in the same place every time...
export CC=xlc_r
configure...
make

Code: [Select]
Fatal Python error: _Py_HashRandomization_Init: failed to get random numbers to initialize Python
Python runtime state: preinitialized

generate-posix-vars failed
make: *** [Makefile:592: pybuilddir.txt] Error 1

It fails as both root and non-root. The only difference I can see between what you posted and what I posted is the major device number for /dev/random and /dev/urandom (35 vs 34); not sure that matters, though.

Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #6 on: February 06, 2020, 06:45:02 AM »
Actually, considering that you mentioned that this is running on a POWER9 - which did not exist in 2015 - it may be that your system level is not supported and this is just one of the unusual surprises.
So, it is not the actual major device number (looking in my sandbox I see values 31, 32, 33, 34 and 44).
Major numbers link it to a kernel driver.
What surprises me is that dd is able to open /dev/random and /dev/urandom but a simple call by open() cannot.
a) You should try to update to any service pack. That your OS level reads 7100-04-00-0000 means it is at the base TL level. SP01 basically came out at the same time as the base level and should have also been applied.

Known Service Packs
-------------------
7100-04-07-1845
7100-04-06-1806
7100-04-05-1720
7100-04-04-1717
7100-04-03-1642
7100-04-02-1614
7100-04-01-1543
7100-04-00-0000

Note: there is also a service pack 08, released in 2019: 7100-04-08-1914, and the latest 7100 is 7100-05-05-1939
p.s. - going to make some changes to urandom.c and write a new program using fopen() to try and better understand what is happening (or not happening).
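
For readers unfamiliar with the oslevel -s string, here is a minimal sketch of how it splits into fields. The struct and function names (aix_level, parse_oslevel, is_base_level) are made up for illustration, and reading the last field as a YYWW build date is my assumption based on the values listed above, not something checked against IBM documentation.

```c
#include <stdio.h>

/* Hypothetical holder for the four fields of an "oslevel -s" string. */
struct aix_level {
    int release;   /* e.g. 7100 for AIX 7.1 */
    int tl;        /* technology level, e.g. 4 */
    int sp;        /* service pack, 0 means none applied */
    int build;     /* assumed to be a YYWW build date, e.g. 1806 */
};

/* Parse a string such as "7100-04-06-1806".
 * Returns 0 on success, -1 if four numeric fields were not found. */
int parse_oslevel(const char *s, struct aix_level *out)
{
    if (sscanf(s, "%d-%d-%d-%d",
               &out->release, &out->tl, &out->sp, &out->build) != 4)
        return -1;
    return 0;
}

/* Non-zero when the level is a bare TL with no service pack applied. */
int is_base_level(const struct aix_level *lv)
{
    return lv->sp == 0;
}
```

With this sketch, "7100-04-00-0000" parses to TL 4, SP 0, so is_base_level() reports a bare technology level, while "7100-04-06-1806" does not.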


Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #7 on: February 06, 2020, 07:06:41 AM »
New program - better error reporting:
Code: [Select]
#include <stdio.h>
#include <string.h>     /* strerror() */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

int main(void)
{
    char data[16];
    ssize_t n;
    int fd;

    /* First attempt: plain read-only open. */
    errno = 0;
    fd = open("/dev/urandom", O_RDONLY);
    if (fd >= 0) {
        printf("open O_RDONLY succeeded\n");
        n = read(fd, data, sizeof(data));
        printf("read(%zu) -> %zd\n", sizeof(data), n);
        close(fd);
    }
    else {
        printf("open O_RDONLY failed with errno:%d:%s\n", errno, strerror(errno));
    }

    /* Second attempt: same open, with O_CLOEXEC added. */
    errno = 0;
    fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    if (fd >= 0) {
        printf("open O_RDONLY | O_CLOEXEC succeeded\n");
        n = read(fd, data, sizeof(data));
        printf("read(%zu) -> %zd\n", sizeof(data), n);
        close(fd);
    }
    else {
        printf("open O_RDONLY | O_CLOEXEC failed with errno:%d:%s\n", errno, strerror(errno));
    }
    return 0;
}
New program (use fopen())
Code: [Select]
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char data[16];
    size_t n;
    FILE *f1;

    f1 = fopen("/dev/urandom", "r");
    if (f1 != NULL) {
        printf("fopen \"r\" succeeded\n");
        /* fread() returns the number of items read, not bytes: one item of
         * sizeof(data) bytes is requested, so a complete read reports 1. */
        n = fread(data, sizeof(data), 1, f1);
        printf("fread:requested (%zu) -> received(%zu)\n", sizeof(data), n);
        fclose(f1);
    }
    else {
        /* perror() appends ": <error text>" and a newline itself. */
        perror("fopen failed");
    }
    return 0;
}

Michael

  • Administrator
  • Hero Member
  • *****
  • Posts: 1266
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #8 on: February 06, 2020, 07:59:05 AM »
 :) - got 7100-04-00-0000 installed.
There is a bug in that release. It seems O_CLOEXEC is included in the fcntl.h include file, but not (yet) supported by the open() call.
So, Python, during config sees the definition and includes it in the defines, and then builds. However, during execution, it fails.
Examples:
Code: [Select]
root@x068:[/data/prj/aixtools/tests/random]oslevel -s
7100-04-00-0000

root@x068:[/data/prj/aixtools/tests/random]find /usr/include -name \*.h | xargs grep O_CLOEXEC
/usr/include/fcntl.h:/* Currently as all 32 bits were used by _F flags, O_CLOEXEC and O_NOFOLLOW was defined as 64 bit.
/usr/include/fcntl.h: * Open function takes int as parameter hence we can't use 64bit O_CLOEXEC and O_NOFOLLOW for open.
/usr/include/fcntl.h: * O_CLOEXEC and _FCLREAD to define O_NOFOLLOW as there is no corresponding O_ flag for _FDEFERIND and _FCLREAD.
/usr/include/fcntl.h:#define O_CLOEXEC       _FDEFERIND      /* sets FD_CLOEXEC on open      */

root@x068:[/data/prj/aixtools/tests/random]./urandom
open O_RDONLY succeeded
read(16) -> 16
open O_RDONLY | O_CLOEXEC failed with errno:22:Invalid argument
So, I expect a quickfix will be to edit fcntl.h and run configure again, OR
maybe it is enough to just edit fcntl.h and comment out O_CLOEXEC.
The code in Python (that is correct!!) is in:
Python/fileutils.c:

Code: [Select]
#ifdef MS_WINDOWS
    flags |= O_NOINHERIT;
#elif defined(O_CLOEXEC)
    atomic_flag_works = &_Py_open_cloexec_works;
    flags |= O_CLOEXEC;
#else
    atomic_flag_works = NULL;
#endif
So, since I do not have a compiler installed on my 7100-04-00-0000 I cannot test myself, but I expect all will be resolved after changing /usr/include/fcntl.h to:
Code: [Select]
root@x068:[/usr/include]diff -u fcntl.h.orig fcntl.h
--- fcntl.h.orig        2015-06-01 14:18:55.000000000 +0000
+++ fcntl.h     2020-02-06 07:56:00.000000000 +0000
@@ -249,9 +249,9 @@
   * Open function takes int as parameter hence we can't use 64bit O_CLOEXEC and O_NOFOLLOW for open.
   * instead of changing _FCLOEXEC and _FNOFOLLOW to 32 bit we have decided to use _FDEFERIND to define
   * O_CLOEXEC and _FCLREAD to define O_NOFOLLOW as there is no corresponding O_ flag for _FDEFERIND and _FCLREAD.
+ #define O_CLOEXEC       _FDEFERIND      /* sets FD_CLOEXEC on open      * /
+ #define O_NOFOLLOW      _FCLREAD        /* do not follow symlinks       * /
   */
- #define O_CLOEXEC       _FDEFERIND      /* sets FD_CLOEXEC on open      */
- #define O_NOFOLLOW      _FCLREAD        /* do not follow symlinks       */

  /* In AIX TTY kernel extension, The terminal parameters are automatically
   * set to default values on first open. Hence O_TTY_INIT will be defined as zero.
Happy Hunting!
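
If patching the system header or updating the OS is not an option, the same failure can in principle be handled at run time. The following is only a sketch under that assumption: open_cloexec is a hypothetical helper name, it is not the code Python uses, and it has not been tested on an affected AIX level. It tries O_CLOEXEC first and, if open() rejects the flag with EINVAL (errno 22, as in the output above), retries without it and sets FD_CLOEXEC through fcntl().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a file with close-on-exec set, even when the
 * header defines O_CLOEXEC but open() rejects it. */
int open_cloexec(const char *path, int flags)
{
    int fd;
    int fdflags;
    int saved;

#ifdef O_CLOEXEC
    fd = open(path, flags | O_CLOEXEC);
    /* Success, or a failure unrelated to the flag: report it as-is. */
    if (fd >= 0 || errno != EINVAL)
        return fd;
    /* EINVAL: assume the flag itself was rejected and retry without it. */
#endif

    fd = open(path, flags);
    if (fd < 0)
        return -1;

    /* Set close-on-exec in a second step (not atomic with the open). */
    fdflags = fcntl(fd, F_GETFD);
    if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

The two-step fallback leaves a short window between open() and fcntl() in which another thread could fork and exec with the descriptor still inheritable, so it is a workaround rather than a full replacement for a working O_CLOEXEC.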


jmealins

  • New Member
  • *
  • Posts: 4
  • Karma: +0/-0
Re: Python 3.8, /dev/urandom, build failures
« Reply #9 on: February 06, 2020, 10:14:53 PM »
Huzzah! The problem definitely was the base release of 7.1; upgrading to TL4 SP6 fixed everything.
Code: [Select]
bash-5.0$ oslevel -s
7100-04-06-1806
bash-5.0$ ./betterurandom
fopen "r" succeeded
fread:requested (16) -> received(1)
bash-5.0$ ./urandom
open O_RDONLY succeeded
read(16) -> 16
open O_RDONLY | O_CLOEXEC succeeded
read(16) -> 16
bash-5.0$
I forgot that we installed the oldest version of 7.1 we could find and not what was currently supported by IBM for building this. Thanks for all your help and effort, the community of folks who build things on AIX is quite small :)

I have been having quite the adventure getting things to build in the new xlC 16.1 compiler with xlclang and xlclang++. Seems like there are even *fewer* people using that, but at least there is C++11 support finally!