Source author record

Gürkan Solmaz

Gürkan Solmaz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: Networking and Internet Architecture, Computation and Language, eess.SP, eess.SY, Machine Learning, Systems and Control

Catalog footprint

What is connected

4 works

6 topics

4 close collaborators


Preprint, 2026, arXiv

AgenticIE: An Adaptive Agent for Information Extraction from Complex Regulatory Documents

Declaration of Performance (DoP) documents, mandated by EU regulation, specify characteristics of construction products, such as fire resistance and insulation. While this information is essential for quality control and reducing carbon footprints, it is not easily machine readable. Despite content requirements, DoPs exhibit significant variation in layout, schema, and format, further complicated by their multilingual nature. In this work, we propose DoP Key Information Extraction (KIE) and Question Answering (QA) as new NLP challenges. To address these challenges, we design a domain-specific AgenticIE system based on a planner-executor-corresponder pattern. For evaluation, we introduce a high-density, expert-annotated dataset of complex, multi-page regulatory documents in English and German. Unlike standard IE datasets (e.g., FUNSD, CORD) with sparse annotations, our dataset contains over 15K annotated entities, averaging over 190 annotations per document. Our agentic system outperforms static and multimodal LLM baselines, achieving Exact Match (EM) scores of 0.396 vs. 0.342 (GPT-4o, +16%) and 0.314 (GPT-4o-V, +26%) across the KIE and QA tasks. Our experimental analysis validates the benefits of the agentic system, as well as the challenging nature of our new DoP dataset.
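The percentage gains quoted in this abstract appear to be relative improvements in Exact Match rather than absolute differences. The short check below reproduces them from the three EM scores given above; the function and variable names are illustrative and do not come from the paper.

```python
def relative_gain(ours: float, baseline: float) -> float:
    """Relative improvement of `ours` over `baseline`, in percent."""
    return (ours - baseline) / baseline * 100.0


# Exact Match scores as quoted in the abstract.
agentic_em = 0.396
gpt4o_em = 0.342
gpt4o_v_em = 0.314

gain_vs_gpt4o = relative_gain(agentic_em, gpt4o_em)
gain_vs_gpt4o_v = relative_gain(agentic_em, gpt4o_v_em)

print(f"vs. GPT-4o:   +{gain_vs_gpt4o:.0f}%")    # +16%
print(f"vs. GPT-4o-V: +{gain_vs_gpt4o_v:.0f}%")  # +26%
```

In absolute terms the differences are 0.054 and 0.082 EM points respectively.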

Preprint, 2022, arXiv

Label Augmentation with Reinforced Labeling for Weak Supervision

Weak supervision (WS) is an alternative to traditional supervised learning that addresses the need for ground truth. Data programming is a practical WS approach that allows programmatic labeling of data samples using labeling functions (LFs) instead of hand-labeling each data point. However, the existing approach fails to fully exploit the domain knowledge encoded into LFs, especially when the LFs' coverage is low. This is due to the common data programming pipeline, which neglects to utilize data features during the generative process. This paper proposes a new approach called reinforced labeling (RL). Given an unlabeled dataset and a set of LFs, RL augments the LFs' outputs to cases not covered by LFs based on similarities among samples. Thus, RL can lead to higher labeling coverage for training an end classifier. The experiments on several domains (classification of YouTube comments, wine quality, and weather prediction) result in considerable gains. The new approach produces significant performance improvement, leading up to +21 points in accuracy and +61 points in F1 scores compared to the state-of-the-art data programming approach.
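The general idea of extending labeling-function outputs to uncovered samples by feature similarity can be illustrated with a minimal sketch. This is not the paper's algorithm: the cosine similarity measure, the nearest-neighbor rule, the 0.9 threshold, the majority vote, and all names below are assumptions chosen only for illustration.

```python
from collections import Counter
from math import sqrt

ABSTAIN = -1  # conventional "no vote" value for a labeling function


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def majority_vote(votes):
    """Most common non-abstain vote, or ABSTAIN if every LF abstained."""
    counted = Counter(v for v in votes if v != ABSTAIN)
    return counted.most_common(1)[0][0] if counted else ABSTAIN


def augment_labels(features, lf_votes, threshold=0.9):
    """Copy the label of the most similar covered sample to each uncovered
    sample, but only when the similarity reaches `threshold` (assumed rule)."""
    labels = [majority_vote(v) for v in lf_votes]
    covered = [i for i, y in enumerate(labels) if y != ABSTAIN]
    augmented = list(labels)
    for i, y in enumerate(labels):
        if y != ABSTAIN or not covered:
            continue
        best = max(covered, key=lambda j: cosine(features[i], features[j]))
        if cosine(features[i], features[best]) >= threshold:
            augmented[i] = labels[best]
    return augmented


# Toy data: two LFs cover only samples 0 and 2.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.7, 0.7]]
lf_votes = [[1, ABSTAIN], [ABSTAIN, ABSTAIN], [0, 0],
            [ABSTAIN, ABSTAIN], [ABSTAIN, ABSTAIN]]

augmented = augment_labels(features, lf_votes)
print(augmented)  # [1, 1, 0, 0, -1]
```

In this toy case coverage rises from two of five samples to four of five, while the ambiguous fifth sample stays unlabeled because it is not similar enough to any covered sample.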

Preprint, 2020, arXiv

A Standard-based Open Source IoT Platform: FIWARE

The ever-increasing acceleration of technology evolution in all fields is rapidly changing the architectures of data-driven systems towards the Internet-of-Things concept. Many general and specific-purpose IoT platforms are already available. This article introduces the capabilities of the FIWARE framework, which is transitioning from a research to a commercial level. We base our exposition on the analysis of three real-world use cases (global IoT market, analytics in smart cities, and IoT augmented autonomous driving) and their requirements, which are addressed with the usage of FIWARE. We highlight the lessons learnt during the design, implementation and deployment phases for each of the use cases and their critical issues. Finally, we give two examples showing that FIWARE still maintains openness to innovation: semantics and privacy.

Preprint, 2020, arXiv

Group-In: Group Inference from Wireless Traces of Mobile Devices

This paper proposes Group-In, a wireless scanning system to detect static or mobile people groups in indoor or outdoor environments. Group-In collects only wireless traces from the Bluetooth-enabled mobile devices for group inference. The key problem addressed in this work is to detect not only static groups but also moving groups with a multi-phased approach based only on noisy wireless Received Signal Strength Indicator (RSSI) values observed by multiple wireless scanners without localization support. We propose new centralized and decentralized schemes to process the sparse and noisy wireless data, and leverage graph-based clustering techniques for group detection from short-term and long-term aspects. Group-In provides two outcomes: 1) group detection in short time intervals such as two minutes and 2) long-term linkages such as a month. To verify the performance, we conduct two experimental studies. One consists of 27 controlled scenarios in the lab environments. The other is a real-world scenario where we place Bluetooth scanners in an office environment, and employees carry beacons for more than one month. Both the controlled and real-world experiments result in high accuracy group detection in short time intervals and sampling liberties in terms of the Jaccard index and pairwise similarity coefficient.
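As a rough illustration of graph-based group detection scored with the Jaccard index, the sketch below groups devices by connected components and compares one detected group against a reference group. It is not Group-In's method: the edge list stands in for whatever pairwise linkage the system derives from RSSI traces, and the device names, edges, and reference group are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def connected_groups(devices, edges):
    """Group devices by connected components of an undirected graph."""
    adjacency = {d: set() for d in devices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, groups = set(), []
    for device in devices:
        if device in seen:
            continue
        stack, component = [device], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


# Hypothetical devices; an edge means two devices were judged to be together.
devices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]

groups = connected_groups(devices, edges)
reference_group = {"a", "b"}  # hypothetical ground truth
score = jaccard(groups[0], reference_group)

print(len(groups))       # 2
print(round(score, 3))   # 0.667
```

Connected components are the simplest possible clustering rule; a real system working with noisy signals would likely need something more robust to spurious edges.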
